Abstract

Motivated by communication through a network employing linear network coding, linear operator channels
(LOCs) over finite fields are studied with arbitrarily distributed transfer matrices. Some intrinsic symmetric properties
of LOCs are revealed and are used to simplify transition matrix computation and input distribution optimization.
Subspace coding for LOCs is studied with the help of the symmetric properties. Our results demonstrate that
constant-dimensional subspace coding is good enough for many typical parameters. For LOCs satisfying certain
constraints, the optimal subspace coding is constant-dimensional. A simple method is derived to find an optimal
constant-dimensional input distribution, as well as the maximum achievable rate using constant-dimensional subspace coding.

Index Terms 

linear operator channel, linear network coding, subspace coding 

I. Introduction 

A linear operator channel (LOC), also called a multiplicative matrix channel, with input X and output Y is given by

$$Y = XH,$$

where $X \in \mathbb{F}^{T\times M}$, $Y \in \mathbb{F}^{T\times N}$ and $H \in \mathbb{F}^{M\times N}$ are matrices over a finite field $\mathbb{F}$ with q elements. H is called the
transfer matrix of the channel. We assume noncoherent transmission, i.e., the instances of H are not known a priori to
either the transmitter or the receiver.

A LOC models communication through a network employing linear network coding [1], [2]. Koetter and
Kschischang [3] observed that the vector space spanned by the column vectors of Y is always a subspace of
the vector space spanned by the column vectors of X, and proposed to modulate information by subspaces for
communication through LOCs, which is called subspace coding. They explored the subspace coding problem for
one use of LOCs [3]. Thereafter, subspace coding attracted considerable research interest, but some
fundamental questions about subspace coding and LOCs remain open. For example, what is the relation between LOCs
and subspace channels, and when is subspace coding optimal for LOCs? More generally, we may want to know how to
design subspace codes for multiple uses of the channel.
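The column-space observation above can be checked exhaustively for a tiny channel. The following is a minimal sketch over the binary field; the parameters T = 3, M = 2, N = 2 and the helper names (`all_mats`, `matmul2`, `col_space`) are illustrative assumptions, not part of the paper.

```python
from itertools import product

def all_mats(r, c):
    """Every r x c matrix over GF(2), as a tuple of row tuples."""
    return [tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            for bits in product((0, 1), repeat=r * c)]

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) % 2
                       for col in zip(*B)) for row in A)

def col_space(A):
    """Column space of A: the set of all GF(2) combinations of its columns."""
    space = {tuple(0 for _ in A)}          # start from the zero vector of length T
    for col in zip(*A):
        space |= {tuple(x ^ y for x, y in zip(v, col)) for v in space}
    return frozenset(space)

T, M, N = 3, 2, 2   # illustrative sizes only
# Check <Y> is a subspace of <X> for every input X and every transfer matrix H.
contained = all(col_space(matmul2(X, H)) <= col_space(X)
                for X in all_mats(T, M) for H in all_mats(M, N))
print(contained)
```

Because the check runs over every pair (X, H), it does not depend on how H is distributed, which is the point of the observation.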

Towards a better understanding of the coding problem for linear network coding, a systematic study of LOCs becomes
necessary. Existing works study special distributions of H. When M = N, Silva et al. [4] studied the case where H is
uniformly chosen from all full rank M × M matrices. Jafari et al. [5] studied the case where H contains uniformly i.i.d.
components. However, in typical network coding applications, the transfer matrix H is jointly determined by the
dynamics of the network topology, the packet dropping pattern, the randomness in linear network coding [6] and
other random factors in the network transmission. So H can have rank deficiency and correlated components, and
the distribution of H is hard to determine. Though studying a specific distribution could yield deeper results, it
is arguable how these results can be applied to other distributions of H.

In this paper we focus on the general properties of a LOC with an arbitrary distribution of H. It is well known
that if T is much larger than M, parts of X can be used to transmit a known matrix such that the receiver can
recover the instances of H. Such a scheme has been widely used for random linear network coding [6] and is
asymptotically optimal when T goes to infinity. Here we are interested in the case of moderate T, where noncoherent
transmission becomes meaningful. Our results, unless otherwise specified, hold for general parameters T, M, N
and q.

In the first part of this paper, we investigate the symmetric properties of LOCs, which hold for any distribution
of H. The input and output of a LOC are matrices and have more degrees of freedom than those of channels
like the BSC and BEC. For example, given the distribution of H, directly computing the transition matrix of a LOC
has complexity $O(q^{TM+MN})$. We show that, using some intrinsic symmetric properties of LOCs, we can reduce
the complexity of computing the transition matrix to $O(\min\{M,T\}\,q^{(M+N)^2/4})$.

Moreover, optimizing the mutual information is usually not easy, since the number of probability masses increases
exponentially with the dimensions of the input matrices. An input distribution is said to be α-type if any two input
matrices whose rows span the same vector space are equiprobable. We show that there exists an α-type
input that achieves the capacity of any LOC. An α-type distribution is equivalent to a distribution over the set of
subspaces of $\mathbb{F}^M$ with dimension less than or equal to $\min\{M,T\}$. Thus, finding an optimal α-type distribution
involves far fewer variables than finding an optimal distribution over $\mathbb{F}^{T\times M}$.

In the second part of this paper, we study subspace coding for LOCs with the help of the symmetric properties.
A subspace degradation of a LOC is defined as the subspace channel induced by the LOC with a given transition
probability from subspaces to matrices. We say that a LOC is uniform if the transition probability of the channel
depends only on the subspaces spanned by the column vectors of the input and output matrices. We show that a
uniform LOC has a unique subspace degradation and that the subspace degradation can achieve the same rate as the
LOC.

Our results demonstrate that constant-dimensional subspace coding may suffice for many scenarios. We
show that the gap between the maximum achievable rate of constant-dimensional subspace coding and the maximum
achievable rate of subspace coding is at most the maximum information that can be transmitted using the input and output
ranks, which is less than $\log_2 \min\{T, M, N\}$ bits per use. This gap is marginal for typical parameters and vanishes
when either q or T goes to infinity. We also show that the optimal subspace coding is constant-dimensional for
LOCs in which i) the rank of H takes every integer value from 0 to M with positive probability and ii) T is sufficiently large.

We derive a linear program to find an optimal constant-dimensional input distribution, as well as the
maximum achievable rate using constant-dimensional inputs. In the general case, the complexity of this linear
program is linear in the number of subspaces of $\mathbb{F}^M$ with dimension less than or equal to $\min\{M,T\}$.
When T is sufficiently large, the optimal dimension is at least the largest rank of H with nonzero probability. For
uniform LOCs, the complexity can be reduced to $\min\{T, M\}$.

Parts of the results of this paper have appeared in our conference paper [7] and online in [8]. A random matrix
is said to be uniform (for a given rank) if all instances of the same rank are equiprobable. Recently, Nobrega et al.
[9] studied LOCs with uniform transfer matrices. A LOC with a uniform transfer matrix is always a uniform LOC
as defined in this paper, but a uniform LOC may have a non-uniform transfer matrix. We make this clear with an
example in this paper.

The rest of this paper is organized as follows. After introducing some notation, we formally define a LOC
in Section III. In Section IV, we introduce some symmetric properties and show how these symmetric properties
simplify the study of LOCs. Subspace degradations of a LOC are introduced in Section V, where we also discuss
uniform LOCs. The mutual information decomposition of subspace degradations in Section VI is an important
consequence of the symmetric properties. Based on this decomposition, we obtain the results about constant-dimensional
subspace coding in Section VII. Uniform LOCs are further discussed in Section VIII.

II. Preliminaries 

Let $\mathbb{F}$ be the finite field with q elements, $\mathbb{F}^t$ be the t-dimensional vector space over $\mathbb{F}$, and $\mathbb{F}^{t\times m}$ be the set of all
$t \times m$ matrices over $\mathbb{F}$. For a matrix X, let rk(X) be its rank, let $X^\top$ be its transpose, and let $\langle X\rangle$ be its column
space, i.e., the subspace spanned by the column vectors of X. Similarly, the row space of X is denoted by $\langle X^\top\rangle$. If V
is a subspace of U, we write $V \le U$.

For a discrete random variable X, we use $p_X$ to denote its probability mass function (PMF). For random variables
X and Y defined on discrete alphabets $\mathcal{X}$ and $\mathcal{Y}$, respectively, we write a transition probability (matrix) from $\mathcal{X}$
to $\mathcal{Y}$ as $P_{Y|X}(Y|X)$, $X \in \mathcal{X}$ and $Y \in \mathcal{Y}$. When the context is clear, we may omit the subscripts of $p_X$ and $P_{Y|X}$
to simplify the notation.
III. Linear Operator Channels

Let T, M and N be positive integers. A LOC with input $X \in \mathbb{F}^{T\times M}$ and output $Y \in \mathbb{F}^{T\times N}$ is given by

$$Y = XH, \tag{1}$$

where H, namely the transfer matrix, is a random matrix distributed over $\mathbb{F}^{M\times N}$. Such a LOC is denoted by
LOC(H, T). By one use of LOC(H, T), we mean that the channel transmits one T × M matrix.

A communication network employing linear network coding can be modeled by a LOC. The source node encodes
its message into batches (also called generations, classes or chunks), each of which contains M packets of T
symbols [10], [11]. Network nodes perform linear network coding among the symbols in the same position of the
packets in one batch, and the coding coefficients are the same for all positions. This packetized transmission
matches our assumption that the transfer matrix stays constant over the T positions of the packets.

In general, the number of packets received for a batch is also a random variable. Here we use a fixed
number N of columns of Y because i) receiving an unbounded number of packets for a batch does not make sense in
practice, so we can bound the number of packets that can be received for a batch by N; and ii) when the number
of received packets is smaller than N, we can always bring it up to N by padding
all-zero columns into Y.

We assume that H and X are independent. Under this assumption, the transition probability $P_{Y|X}(Y|X)$ is given by

$$P_{Y|X}(Y|X) = \Pr\{XH = Y\}. \tag{2}$$

A LOC is a discrete memoryless channel (DMC). The capacity of LOC(H, T) is

$$C(H,T) = \max_{p_X} I(X;Y).$$

Achieving the capacity generally involves multiple uses of the channel. A block code for LOC(H, T) is a subset
of $(\mathbb{F}^{T\times M})^n$, the nth Cartesian power of $\mathbb{F}^{T\times M}$. Here n is the length of the block code. Since the components of
codewords are matrices, such a code is called a matrix code. The channel capacity of a LOC can be approached
by a sequence of matrix codes with $n \to \infty$.

A. Markov Chains 

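Let X be a random variable over $\mathbb{F}^{t\times m}$. Let $\mathrm{Pj}(\mathbb{F}^t)$ be the collection of all subspaces of $\mathbb{F}^t$. Let $\langle X\rangle$ be a random
variable over $\mathrm{Pj}(\mathbb{F}^t)$ with

$$p_{\langle X\rangle}(U) = \Pr\{\langle X\rangle = U\} = \sum_{X \in \mathbb{F}^{t\times m}:\ \langle X\rangle = U} p_X(X). \tag{3}$$

Denote by $X^\top$ the random variable over $\mathbb{F}^{m\times t}$ with $p_{X^\top}(X^\top) = p_X(X)$. Combining the above notation, $\langle X^\top\rangle$ is
a random variable over $\mathrm{Pj}(\mathbb{F}^m)$ with

$$p_{\langle X^\top\rangle}(V) = \sum_{X \in \mathbb{F}^{t\times m}:\ \langle X^\top\rangle = V} p_X(X).$$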
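[Figure 1: directed graph whose nodes are the random variables rk(X), $\langle X^\top\rangle$, $\langle X\rangle$, X, Y, $\langle Y^\top\rangle$, $\langle Y\rangle$ and rk(Y), with X connected to Y.]

Fig. 1. Random variables and Markov chains related to LOC(H, T).
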

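Furthermore, denote by rk(X) the random variable with

$$p_{\mathrm{rk}(X)}(r) = \sum_{X:\ \mathrm{rk}(X) = r} p_X(X). \tag{4}$$

It is easy to see that rk(X) is a deterministic function of $\langle X\rangle$ (or of $\langle X^\top\rangle$), and that $\langle X\rangle$ (or $\langle X^\top\rangle$) is a deterministic function
of X.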

Now we consider LOC(H, T) where H has dimension M × N. Applying the above definitions to the input
X and the output Y, we obtain the relations between random variables shown in Fig. 1. These random variables
are given as the nodes of a directed graph. All the random variables on a directed path form a Markov chain. For
example, $\mathrm{rk}(X) \to \langle X\rangle \to X \to Y \to \langle Y\rangle \to \mathrm{rk}(Y)$ forms a Markov chain. Let r, U, X, Y, V and s be
instances of rk(X), $\langle X\rangle$, X, Y, $\langle Y\rangle$ and rk(Y), respectively. To verify this Markov chain, we only need to check
the deterministic relations between these random variables:

$$p(r,U,X,Y,V,s) = \begin{cases} p(X,Y) & \text{if } \langle X\rangle = U,\ \dim(U) = r,\ \langle Y\rangle = V,\ \dim(V) = s, \\ 0 & \text{o.w.,} \end{cases}$$

$$p_{\mathrm{rk}(X)\langle X\rangle}(r,U) = \begin{cases} p_{\langle X\rangle}(U) & \text{if } \dim(U) = r, \\ 0 & \text{o.w.,} \end{cases}$$

and

$$p_{\langle Y\rangle\mathrm{rk}(Y)}(V,s) = \begin{cases} p_{\langle Y\rangle}(V) & \text{if } \dim(V) = s, \\ 0 & \text{o.w.} \end{cases}$$

Using the above relations, it is readily seen that

$$p(r,U,X,Y,V,s)\,p(U)\,p(X)\,p(Y)\,p(V) = p(r,U)\,p(U,X)\,p(X,Y)\,p(Y,V)\,p(V,s),$$

which matches an alternative definition of a Markov chain given in [12, Section 2.1]. The other Markov chains shown
in Fig. 1 can be verified accordingly.

IV. Symmetric Properties and Some Applications 

We first state an intrinsic symmetric property of any LOC, which induces the other symmetric properties of LOCs
used in this paper. A matrix is said to have full column (row) rank if its rank is equal to its number of columns
(rows).

Theorem 1: For LOC(H, T), if X = BD and Y = BE, where B has full column rank, then

$$P_{Y|X}(Y|X) = \Pr\{XH = Y\} = \Pr\{DH = E\}.$$

Proof: The theorem follows from $P_{Y|X}(Y|X) = \Pr\{BDH = BE\} = \Pr\{DH = E\}$, where the last equality
follows because B has full column rank. ■
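Theorem 1 can be sanity-checked by brute force on a small binary example. The sketch below assumes q = 2, M = N = 2, an arbitrary (non-uniform) PMF on H chosen only for illustration, and a hand-picked 3 × 2 matrix B of full column rank; none of these choices come from the paper.

```python
from fractions import Fraction
from itertools import product

def all_mats(r, c):
    """Every r x c matrix over GF(2), as a tuple of row tuples."""
    return [tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            for bits in product((0, 1), repeat=r * c)]

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) % 2
                       for col in zip(*B)) for row in A)

M = N = 2
H_SPACE = all_mats(M, N)
total = sum(range(1, len(H_SPACE) + 1))
# Arbitrary illustrative PMF on H: the i-th matrix gets weight (i + 1) / total.
p_H = {H: Fraction(i + 1, total) for i, H in enumerate(H_SPACE)}

def prob(D, E):
    """Pr{DH = E} under p_H, computed exactly."""
    return sum((p for H, p in p_H.items() if matmul2(D, H) == E), Fraction(0))

B = ((1, 0), (0, 1), (1, 1))   # 3 x 2 with full column rank (assumed example)
# Compare Pr{(BD)H = BE} with Pr{DH = E} for every 2 x 2 pair (D, E).
theorem_holds = all(prob(matmul2(B, D), matmul2(B, E)) == prob(D, E)
                    for D in all_mats(2, M) for E in all_mats(2, N))
print(theorem_holds)
```

Exact rational arithmetic is used so that the equality test is not affected by floating-point rounding.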

A. Computation of Transition Matrix 

The matrix of transition probabilities is also called the transition matrix. Computing the transition matrix using
(2) has complexity $O(q^{TM+MN})$, since there are $q^{TM}$ choices of X and, for each X, at most $q^{MN}$
choices of Y such that $P_{Y|X}(Y|X) \neq 0$. We can use Theorem 1 to reduce the complexity.

Let B be a t × r matrix with rank r. For a t × m matrix A with $\langle A\rangle \subseteq \langle B\rangle$, define A/B to be the matrix such
that A = B(A/B). The notation "/" is well defined because i) there always exists C such that A = BC since
$\langle A\rangle \subseteq \langle B\rangle$ and ii) such C is unique since B has full column rank.

Corollary 2: Let X and Y be the input and output matrices of LOC(H, T), respectively, with $\langle Y\rangle \subseteq \langle X\rangle$. Fix
a full column rank matrix B with $\langle X\rangle = \langle B\rangle$. Then,

$$P_{Y|X}(Y|X) = \Pr\{(X/B)H = Y/B\}. \tag{5}$$

Proof: Since X = B(X/B) and Y = B(Y/B), the result follows from Theorem 1. ■

In Corollary 2, the dimension of X/B is rk(X) × M and the dimension of Y/B is rk(X) × N. Since rk(X) ≤ M
and X/B has full row rank, the computation of the transition matrix of LOC(H, T) reduces to computing

$$\Pr\{DH = E\}, \quad \text{for all } D \in \mathrm{Fr}(\mathbb{F}^{k\times M}),\ k = 0, 1, \ldots, \min\{M, T\}, \tag{6}$$

where $\mathrm{Fr}(\mathbb{F}^{k\times M})$ denotes the set of full rank matrices in $\mathbb{F}^{k\times M}$. For a fixed k, the number of D that need to be
considered is $|\mathrm{Fr}(\mathbb{F}^{k\times M})|$, which is given by the number $\chi^M_k$ defined in (33) (see Lemma 8 in Appendix A).

We can further simplify the computation. For any $D \in \mathrm{Fr}(\mathbb{F}^{M\times M})$, by Corollary 2,

$$\Pr\{DH = E\} = \Pr\{H = D^{-1}E\}.$$

In other words, we only need to consider one matrix in $\mathrm{Fr}(\mathbb{F}^{M\times M})$. The following symmetric property summarizes
this observation.

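Theorem 3: Consider a LOC(H, T) where H has dimension M × N. For k ≤ M and $D_1, D_2 \in \mathrm{Fr}(\mathbb{F}^{k\times M})$, if
$\langle D_1^\top\rangle = \langle D_2^\top\rangle$, i.e., the row spaces of $D_1$ and $D_2$ are the same, the vector $(\Pr\{D_1H = E\} : E \in \mathbb{F}^{k\times N})$ is a
permutation of the vector $(\Pr\{D_2H = E\} : E \in \mathbb{F}^{k\times N})$.

Proof: We only need to show that there exists a bijection $f : \mathbb{F}^{k\times N} \to \mathbb{F}^{k\times N}$ such that $\Pr\{D_1H = E\} =
\Pr\{D_2H = f(E)\}$. Since $D_1, D_2 \in \mathrm{Fr}(\mathbb{F}^{k\times M})$ and $\langle D_1^\top\rangle = \langle D_2^\top\rangle$, there exists a unique full rank k × k matrix $\mathbf{T}$ such
that $D_2 = \mathbf{T}D_1$. Define $f : \mathbb{F}^{k\times N} \to \mathbb{F}^{k\times N}$ as $f(E) = \mathbf{T}E$. Since $\mathbf{T}$ is full rank, f is a bijection. The proof is
completed by $\Pr\{D_2H = f(E)\} = \Pr\{D_2H = \mathbf{T}E\} = \Pr\{\mathbf{T}^{-1}D_2H = E\} = \Pr\{D_1H = E\}$. ■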
Let $\mathrm{Pj}(m, \mathbb{F}^t)$ be the subset of $\mathrm{Pj}(\mathbb{F}^t)$ that contains all the subspaces with dimension less than or equal
to m. By Theorem 3, for each subspace in $\mathrm{Pj}(\min\{T, M\}, \mathbb{F}^M)$, we only need to choose one D to compute
$(\Pr\{DH = E\} : E \in \mathbb{F}^{k\times N})$. The Grassmannian $\mathrm{Gr}(r, \mathbb{F}^t)$ is the set of all r-dimensional subspaces of $\mathbb{F}^t$. Thus
$\mathrm{Pj}(\min\{T,M\}, \mathbb{F}^M) = \bigcup_{k \le \min\{T,M\}} \mathrm{Gr}(k, \mathbb{F}^M)$. By Lemma 10 in the appendix, $|\mathrm{Gr}(k, \mathbb{F}^M)| = \binom{M}{k}_q$, where $\binom{M}{k}_q$
is called the Gaussian binomial. So the overall complexity of computing the transition matrix is

$$\sum_{k=0}^{\min\{T,M\}} \binom{M}{k}_q q^{kN} < \begin{cases} cMq^{(M+N)^2/4}, & N < M,\ M \le T \\ cMq^{MN}, & N \ge M,\ M \le T \\ cTq^{(M+N)^2/4}, & T \ge \frac{M+N}{2},\ M > T \\ cTq^{(M+N-T)T}, & T < \frac{M+N}{2},\ M > T \end{cases}$$

where c ≈ 3.4627 is a constant (see Lemma 14 in the appendix).
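To get a feel for the saving, the two operation counts can be compared numerically. This is only a sketch: it assumes that the number of full rank r × m matrices is $(q^m-1)(q^m-q)\cdots(q^m-q^{r-1})$ (named `chi` here), that the Gaussian binomial is the ratio `chi(m, r) / chi(r, r)`, and that the reduced cost is the sum over k of (number of k-dimensional subspaces of $\mathbb{F}^M$) × $q^{kN}$, as read from the text above. The function names are hypothetical.

```python
def chi(m, r, q):
    """Number of full rank r x m matrices over GF(q), for r <= m (assumed definition)."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial: number of r-dimensional subspaces of an m-dimensional space."""
    return chi(m, r, q) // chi(r, r, q)

def reduced_count(T, M, N, q):
    """One representative D per subspace of dimension k <= min(T, M), times q^(kN) outputs E."""
    return sum(gauss_binom(M, k, q) * q**(k * N) for k in range(min(T, M) + 1))

def brute_count(T, M, N, q):
    """Direct enumeration: q^(TM) inputs times q^(MN) transfer matrices."""
    return q**(T * M + M * N)

for T, M, N in [(2, 2, 2), (4, 4, 4), (8, 4, 4)]:
    print((T, M, N), reduced_count(T, M, N, 2), brute_count(T, M, N, 2))
```

Note that the reduced count does not grow with T once T ≥ M, while the direct count grows as $q^{TM}$.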

B. α-type Input Distributions

In general, finding an optimal input distribution exactly requires determining $q^{TM}$ probability masses. Here we
show that the problem can be reduced to finding an optimal distribution over $\mathrm{Pj}(\min\{T, M\}, \mathbb{F}^M)$.

Definition 1: A PMF p over $\mathbb{F}^{T\times M}$ is α-type if p(X) = p(X') for all $X, X' \in \mathbb{F}^{T\times M}$ with $\langle X^\top\rangle = \langle X'^\top\rangle$, i.e.,
with the same row space.

Lemma 1: A function $p : \mathbb{F}^{T\times M} \to \mathbb{R}$ is an α-type PMF if and only if it can be written as

$$p(X) = \frac{Q(\langle X^\top\rangle)}{\chi^T_{\mathrm{rk}(X)}} \tag{7}$$

for a certain PMF Q over $\mathrm{Pj}(\min\{M, T\}, \mathbb{F}^M)$.

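Proof: Assume p is an α-type input. Define $Q : \mathrm{Pj}(\min\{M, T\}, \mathbb{F}^M) \to \mathbb{R}$ as

$$Q(U) = \sum_{X' \in \mathbb{F}^{T\times M}:\ \langle X'^\top\rangle = U} p(X').$$

For $X \in \mathbb{F}^{T\times M}$,

$$Q(\langle X^\top\rangle) = \sum_{X' \in \mathbb{F}^{T\times M}:\ \langle X'^\top\rangle = \langle X^\top\rangle} p(X') = p(X) \sum_{X' \in \mathbb{F}^{T\times M}:\ \langle X'^\top\rangle = \langle X^\top\rangle} 1 = p(X)\,\chi^T_{\mathrm{rk}(X)},$$

where the last equality follows from Lemma 17 in Appendix B. This proves the necessary condition.

Now we prove the sufficient condition. Let Q be a PMF over $\mathrm{Pj}(\min\{M, T\}, \mathbb{F}^M)$. Define a function $p : \mathbb{F}^{T\times M} \to \mathbb{R}$ as

$$p(X) = \frac{Q(\langle X^\top\rangle)}{\chi^T_{\mathrm{rk}(X)}}.$$

We can check that for $X, X' \in \mathbb{F}^{T\times M}$ with $\langle X^\top\rangle = \langle X'^\top\rangle$,

$$p(X) = \frac{Q(\langle X^\top\rangle)}{\chi^T_{\mathrm{rk}(X)}} = \frac{Q(\langle X'^\top\rangle)}{\chi^T_{\mathrm{rk}(X')}} = p(X'),$$

and

$$\sum_{X} p(X) = \sum_{U \in \mathrm{Pj}(\min\{M,T\}, \mathbb{F}^M)}\ \sum_{X:\ \langle X^\top\rangle = U} \frac{Q(U)}{\chi^T_{\dim(U)}} = \sum_{U} \frac{Q(U)}{\chi^T_{\dim(U)}} \sum_{X:\ \langle X^\top\rangle = U} 1 = \sum_{U} Q(U) = 1.$$

Thus p is an α-type PMF. ■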
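The counting fact behind Lemma 1 (every row space of dimension r is shared by the same number of T × M matrices) and the resulting construction of a PMF from a distribution Q over subspaces can be reproduced on a toy case. The sketch assumes q = 2, T = M = 2, a uniform Q chosen arbitrarily, and that $\chi^T_r$ equals $(q^T-1)(q^T-q)\cdots(q^T-q^{r-1})$; the helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def all_mats(r, c):
    """Every r x c matrix over GF(2), as a tuple of row tuples."""
    return [tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            for bits in product((0, 1), repeat=r * c)]

def row_space(A):
    """Row space of A: the set of all GF(2) combinations of its rows."""
    space = {tuple(0 for _ in A[0])}
    for row in A:
        space |= {tuple(x ^ y for x, y in zip(v, row)) for v in space}
    return frozenset(space)

def chi(m, r, q=2):
    """(q^m - 1)(q^m - q)...(q^m - q^(r-1)); assumed meaning of the paper's symbol."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

T, M = 2, 2
inputs = all_mats(T, M)
class_size = Counter(row_space(X) for X in inputs)

# A subspace with 2^r elements has dimension r = bit_length - 1.
sizes_match = all(n == chi(T, len(U).bit_length() - 1) for U, n in class_size.items())

# Build p from Q as in (7); Q is uniform over the row spaces purely for illustration.
Q = {U: Fraction(1, len(class_size)) for U in class_size}
p = {X: Q[row_space(X)] / class_size[row_space(X)] for X in inputs}
print(sizes_match, sum(p.values()))
```

Here p is described by 5 numbers (one per subspace of $\mathbb{F}_2^2$) instead of 16 (one per matrix), which is the reduction the lemma formalizes.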
Theorem 4: There exists an α-type input that maximizes I(X; Y) for any LOC, i.e.,

$$C(H,T) = \max_{p_X:\ \alpha\text{-type}} I(X;Y).$$

Proof: This theorem is proved using Theorem 1 and the concavity of mutual information as a function of the input
distribution. See Section IV-C for details. ■

Theorem 4 narrows down the search for an optimal input. To determine a PMF over $\mathrm{Pj}(\min\{M, T\}, \mathbb{F}^M)$,
we have $|\mathrm{Pj}(\min\{M,T\}, \mathbb{F}^M)|$ parameters to determine. We know

$$|\mathrm{Pj}(\min\{M, T\}, \mathbb{F}^M)| = \sum_{k=0}^{\min\{M,T\}} \binom{M}{k}_q < c \min\{M,T\}\, q^{M^2/4},$$

where c ≈ 3.4627 (see Lemma 14 in the appendix).



C. More Symmetric Properties 

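The following lemma is used to prove Theorem 4.

Lemma 2: Let $p_X$ be an input distribution of LOC(H, T) where H has dimension M × N. Define $p'_X : \mathbb{F}^{T\times M} \to \mathbb{R}$
as $p'_X(X) = p_X(\Phi X)$, where $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$. Then i) $p'_X$ is a PMF, ii) $I(X;Y)|_{p_X} = I(X;Y)|_{p'_X}$ and iii)
$I(\langle X\rangle;\langle Y\rangle)|_{p_X} = I(\langle X\rangle;\langle Y\rangle)|_{p'_X}$.

Proof: First, $p'_X$ is a PMF because $0 \le p'_X(X) = p_X(\Phi X) \le 1$ and

$$\sum_{X \in \mathbb{F}^{T\times M}} p'_X(X) = \sum_{X \in \mathbb{F}^{T\times M}} p_X(\Phi X) = \sum_{X \in \Phi\mathbb{F}^{T\times M}} p_X(X) = \sum_{X \in \mathbb{F}^{T\times M}} p_X(X) = 1.$$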
Let $p_Y$ and $p'_Y$ be the PMFs of Y when the inputs are $p_X$ and $p'_X$, respectively. We have

$$p'_Y(Y) = \sum_{X \in \mathbb{F}^{T\times M}} p'_X(X) P_{Y|X}(Y|X) \overset{(a)}{=} \sum_{X \in \mathbb{F}^{T\times M}} p_X(\Phi X) P_{Y|X}(\Phi Y|\Phi X) \overset{(b)}{=} \sum_{X' \in \mathbb{F}^{T\times M}} p_X(X') P_{Y|X}(\Phi Y|X') = p_Y(\Phi Y),$$

where (a) follows from Theorem 1 and $p'_X(X) = p_X(\Phi X)$, and (b) follows by letting $X' = \Phi X$. Therefore,

$$I(X;Y)|_{p'_X} = \sum_{X} p'_X(X) \sum_{Y} P(Y|X) \log_2 \frac{P(Y|X)}{p'_Y(Y)} \overset{(c)}{=} \sum_{X} p_X(\Phi X) \sum_{Y} P(\Phi Y|\Phi X) \log_2 \frac{P(\Phi Y|\Phi X)}{p_Y(\Phi Y)} = \sum_{X'} p_X(X') \sum_{Y'} P(Y'|X') \log_2 \frac{P(Y'|X')}{p_Y(Y')} = I(X;Y)|_{p_X},$$

where (c) follows from Theorem 1.

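The last equality in the lemma can be proved similarly. First,

$$p'_{\langle X\rangle}(U) = \sum_{X:\ \langle X\rangle = U} p'_X(X) = \sum_{X:\ \langle X\rangle = U} p_X(\Phi X) \overset{(d)}{=} \sum_{X':\ \langle X'\rangle = \Phi U} p_X(X') = p_{\langle X\rangle}(\Phi U),$$

where (d) follows from Theorem 1. Let $P'_{\langle Y\rangle|\langle X\rangle}(V|U)$ be the transition probability when the input is $p'_X$. For
$U \le \mathbb{F}^T$ with $p'_{\langle X\rangle}(U) > 0$,

$$P'_{\langle Y\rangle|\langle X\rangle}(V|U) = \frac{\sum_{X,Y:\ \langle X\rangle = U,\ \langle Y\rangle = V} P_{Y|X}(Y|X)\, p'_X(X)}{p'_{\langle X\rangle}(U)} = \frac{\sum_{X,Y:\ \langle X\rangle = U,\ \langle Y\rangle = V} P_{Y|X}(\Phi Y|\Phi X)\, p_X(\Phi X)}{p_{\langle X\rangle}(\Phi U)} = P_{\langle Y\rangle|\langle X\rangle}(\Phi V|\Phi U).$$

Hence,

$$p'_{\langle Y\rangle}(V) = \sum_{U} P'_{\langle Y\rangle|\langle X\rangle}(V|U)\, p'_{\langle X\rangle}(U) = \sum_{U} P_{\langle Y\rangle|\langle X\rangle}(\Phi V|\Phi U)\, p_{\langle X\rangle}(\Phi U) = p_{\langle Y\rangle}(\Phi V).$$

Therefore,

$$I(\langle X\rangle;\langle Y\rangle)|_{p'_X} = \sum_{U} p'_{\langle X\rangle}(U) \sum_{V} P'(V|U) \log_2 \frac{P'(V|U)}{p'_{\langle Y\rangle}(V)} = \sum_{U} p_{\langle X\rangle}(\Phi U) \sum_{V} P(\Phi V|\Phi U) \log_2 \frac{P(\Phi V|\Phi U)}{p_{\langle Y\rangle}(\Phi V)} = I(\langle X\rangle;\langle Y\rangle)|_{p_X}.$$

■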



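Proof of Theorem 4: Consider LOC(H, T). Let p be an optimal input distribution for the channel. For $\Phi \in
\mathrm{Fr}(\mathbb{F}^{T\times T})$, define $p_\Phi$ as $p_\Phi(X) = p(\Phi X)$. By Lemma 2, $p_\Phi$ also achieves the capacity of the LOC. Define $p^*$
as

$$p^*(X) = \frac{1}{|\mathrm{Fr}(\mathbb{F}^{T\times T})|} \sum_{\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})} p_\Phi(X).$$

By the concavity of the mutual information, we know that $p^*$ is also an optimal input for the channel.

Now we show that $p^*$ is α-type. Consider $X, X' \in \mathbb{F}^{T\times M}$ with $\langle X^\top\rangle = \langle X'^\top\rangle$. There always exists $\Phi_0 \in
\mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $X' = \Phi_0 X$ (see Lemma 16 in the appendix). We have

$$p^*(\Phi_0 X) = \frac{1}{|\mathrm{Fr}(\mathbb{F}^{T\times T})|} \sum_{\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})} p_\Phi(\Phi_0 X) = \frac{1}{|\mathrm{Fr}(\mathbb{F}^{T\times T})|} \sum_{\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})} p(\Phi\Phi_0 X) = p^*(X),$$

where in the last equality we use $\mathrm{Fr}(\mathbb{F}^{T\times T})\Phi_0 = \mathrm{Fr}(\mathbb{F}^{T\times T})$. ■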
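The averaging step in this proof is easy to reproduce numerically. The sketch below assumes q = 2 and T = M = 2, starts from an arbitrary PMF p that is deliberately not α-type, averages it over all invertible 2 × 2 binary matrices, and checks that the result is constant on each class of matrices sharing a row space. The helper names and the starting PMF are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def all_mats(r, c):
    """Every r x c matrix over GF(2), as a tuple of row tuples."""
    return [tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            for bits in product((0, 1), repeat=r * c)]

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) % 2
                       for col in zip(*B)) for row in A)

def row_space(A):
    """Row space of A as the set of all GF(2) combinations of its rows."""
    space = {tuple(0 for _ in A[0])}
    for row in A:
        space |= {tuple(x ^ y for x, y in zip(v, row)) for v in space}
    return frozenset(space)

T, M = 2, 2
inputs = all_mats(T, M)
total = sum(range(1, len(inputs) + 1))
p = {X: Fraction(i + 1, total) for i, X in enumerate(inputs)}   # arbitrary, all masses distinct

# Invertible 2 x 2 matrices over GF(2): determinant ad + bc is odd.
GL = [P for P in all_mats(T, T) if (P[0][0] * P[1][1] + P[0][1] * P[1][0]) % 2 == 1]

# p*(X) = average of p(Phi X) over all invertible Phi.
p_star = {X: sum(p[matmul2(P, X)] for P in GL) / len(GL) for X in inputs}

is_alpha_type = all(p_star[X] == p_star[X2]
                    for X in inputs for X2 in inputs
                    if row_space(X) == row_space(X2))
print(is_alpha_type, sum(p_star.values()))
```

The check says nothing about optimality; it only illustrates that the symmetrized distribution has the α-type structure, which is the part of the argument that does not rely on concavity.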

V. Subspace Degradations of LOCs 

Consider the Markov chain $\langle X\rangle \to X \to Y \to \langle Y\rangle$ related to LOC(H, T). The transition probability from X to
Y is given by (2). The transition probability from Y to $\langle Y\rangle$ is deterministic:

$$P_{\langle Y\rangle|Y}(V|Y) = \begin{cases} 1, & \langle Y\rangle = V \\ 0, & \text{o.w.} \end{cases}$$

The transition probability $P_{X|\langle X\rangle}$ is not determined by the LOC.

Definition 2: Consider LOC(H, T) with transition probability $P_{Y|X}$. Given a transition probability $P_{X|\langle X\rangle}$, we
have a new channel law given by $P_{\langle Y\rangle|\langle X\rangle}(V|U)$. This channel takes subspaces as input and output and is called
the subspace degradation of LOC(H, T) with respect to (w.r.t.) $P_{X|\langle X\rangle}$. This degradation is well defined since the
transition probability $P_{\langle Y\rangle|\langle X\rangle}$ is determined by the above Markov chain as

$$P_{\langle Y\rangle|\langle X\rangle}(V|U) = \sum_{X} P_{\langle Y\rangle|X}(V|X) P_{X|\langle X\rangle}(X|U) = \sum_{X:\ \langle X\rangle = U}\ \sum_{Y:\ \langle Y\rangle = V} P_{Y|X}(Y|X) P_{X|\langle X\rangle}(X|U). \tag{8}$$

For a subspace degradation of LOC(H, T) w.r.t. $P_{X|\langle X\rangle}$, the mutual information between $\langle X\rangle$ and $\langle Y\rangle$ can be
written as a function of $p_{\langle X\rangle}$ and $P_{\langle Y\rangle|\langle X\rangle}$, in which $P_{\langle Y\rangle|\langle X\rangle}$, given in (8), is a function of $P_{X|\langle X\rangle}(X|U)$. The
capacity of a subspace degradation of a LOC is $\max_{p_{\langle X\rangle}} I(\langle X\rangle; \langle Y\rangle)$. Therefore, the maximum achievable rate of
subspace degradations of LOC(H, T) is

$$C_{\mathrm{SS}}(H,T) = \max_{P_{X|\langle X\rangle}}\ \max_{p_{\langle X\rangle}} I(\langle X\rangle; \langle Y\rangle).$$

The rate $C_{\mathrm{SS}}(H,T)$ is achievable since $\max_{p_{\langle X\rangle}} I(\langle X\rangle; \langle Y\rangle)$ is achievable for any given $P_{X|\langle X\rangle}$.

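Lemma 3: For LOC(H, T), $I(\langle X\rangle; \langle Y\rangle)$ is determined by $p_X$, i.e., we can treat $I(\langle X\rangle; \langle Y\rangle)$ as a function of $p_X$
for a given LOC and write

$$C_{\mathrm{SS}}(H,T) = \max_{p_X} I(\langle X\rangle;\langle Y\rangle).$$

Proof: We show that $p_{\langle X\rangle}(U)$ and $P_{X|\langle X\rangle}(X|U)$ appearing in $I(\langle X\rangle; \langle Y\rangle)$ are determined by $p_X$. First, we
obtain $p_{\langle X\rangle}$ from $p_X$ as shown in (3). Second, since

$$P_{X|\langle X\rangle}(X|U)\, p_{\langle X\rangle}(U) = \Pr\{X = X,\ \langle X\rangle = U\} = \begin{cases} p_X(X), & \langle X\rangle = U \\ 0, & \text{o.w.,} \end{cases}$$

we have

$$P_{X|\langle X\rangle}(X|U) = \begin{cases} \dfrac{p_X(X)}{p_{\langle X\rangle}(U)}, & p_{\langle X\rangle}(U) \neq 0,\ \langle X\rangle = U \\ 0, & \langle X\rangle \neq U. \end{cases} \tag{9}$$

That means, for U with $p_{\langle X\rangle}(U) > 0$, $P_{X|\langle X\rangle}(X|U)$ is determined by $p_X$. Moreover, if $p_{\langle X\rangle}(U) = 0$, $P_{X|\langle X\rangle}(X|U)$
does not appear in $I(\langle X\rangle; \langle Y\rangle)$. Thus, $I(\langle X\rangle; \langle Y\rangle)$ can be regarded as a function of the single variable $p_X$. This
also implies that

$$C_{\mathrm{SS}}(H,T) \ge \max_{p_X} I(\langle X\rangle;\langle Y\rangle).$$

On the other hand, given $P_{X|\langle X\rangle}$ and $p_{\langle X\rangle}$, we have a PMF of X given by

$$p_X(X) = p_{\langle X\rangle}(\langle X\rangle)\, P_{X|\langle X\rangle}(X|\langle X\rangle),$$

which establishes that

$$C_{\mathrm{SS}}(H,T) \le \max_{p_X} I(\langle X\rangle;\langle Y\rangle).$$

The proof is completed. ■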
A. Uniform LOCs

In general, $I(X; Y) \ge I(\langle X\rangle; \langle Y\rangle)$. We may want to know when equality holds, in which case using subspaces
suffices.

Definition 3: LOC(H, T) is uniform (for given input and output subspaces) if there exists a function $\mu : \mathrm{Pj}(\mathbb{F}^T) \times
\mathrm{Pj}(\mathbb{F}^T) \to [0, 1]$ such that

$$\Pr\{Y = XH\} = \begin{cases} \mu(\langle X\rangle, \langle Y\rangle), & \langle Y\rangle \subseteq \langle X\rangle \\ 0, & \text{o.w.} \end{cases}$$

Theorem 5: A uniform LOC(H, T) where H has dimension M × N has a unique subspace degradation given by

$$P_{\langle Y\rangle|\langle X\rangle}(V|U) = \mu(U,V)\, \chi^N_{\dim(V)}.$$

Moreover, for a uniform LOC, $I(X; Y) = I(\langle X\rangle; \langle Y\rangle)$.

Proof: See Section V-B. ■

The number of M × N matrices with rank r is given by $\chi^{M,N}_r$ defined in (36) (see Lemma 11 in Appendix A).
There is another property under which $I(X; Y) = I(\langle X\rangle; \langle Y\rangle)$ holds.

Definition 4: A random matrix H over $\mathbb{F}^{M\times N}$ is uniform (for a given rank) if

$$p_H(H) = \frac{p_{\mathrm{rk}(H)}(\mathrm{rk}(H))}{\chi^{M,N}_{\mathrm{rk}(H)}}.$$

Theorem 6: Let H be a random matrix with dimension M × N. i) If LOC(H, T) is uniform and T ≥ M, then
H is uniform. ii) If H is uniform, then LOC(H, T) is uniform.

Proof: See Section VIII-C. ■
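Now we give a uniform LOC that has a non-uniform transfer matrix. Let H be a 2 × 2 random matrix over the
binary field with

$$p_H\!\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = p_H\!\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = p_H\!\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = p_H\!\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = 0.25.$$

H is not uniform in the sense of Definition 4, but we can verify that LOC(H, 1) is uniform in the sense of
Definition 3.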
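The claim in this example can be verified by enumeration. The zero entries of the four matrices did not survive in the text above, so the specific matrices used in this sketch are an assumption: they are one assignment consistent with the visible entries for which the claim holds. With T = 1, the column space of a 1 × 2 matrix is either {0} or $\mathbb{F}^1$, so the pair (input subspace, output subspace) is encoded below by whether X and Y are nonzero.

```python
from fractions import Fraction

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) % 2
                       for col in zip(*B)) for row in A)

# Assumed reconstruction of the example: four 2 x 2 binary matrices, each with probability 1/4.
H_PMF = {
    ((1, 0), (0, 0)): Fraction(1, 4),
    ((0, 1), (1, 0)): Fraction(1, 4),
    ((0, 0), (0, 1)): Fraction(1, 4),
    ((1, 1), (1, 1)): Fraction(1, 4),
}

def transition(X, Y):
    """Pr{XH = Y} for 1 x 2 matrices X and Y."""
    return sum((p for H, p in H_PMF.items() if matmul2(X, H) == Y), Fraction(0))

vectors = [((a, b),) for a in (0, 1) for b in (0, 1)]   # all 1 x 2 binary matrices

def col_dim(A):
    """Dimension of the column space of a 1 x 2 matrix: 1 if nonzero, else 0."""
    return int(any(A[0]))

# Collect the transition probabilities seen for each (dim <X>, dim <Y>) pair.
mu = {}
for X in vectors:
    for Y in vectors:
        mu.setdefault((col_dim(X), col_dim(Y)), set()).add(transition(X, Y))

loc_is_uniform = all(len(values) == 1 for values in mu.values())
# H itself is not uniform: two rank-one matrices receive different probabilities.
h_is_uniform = H_PMF.get(((1, 0), (1, 0)), Fraction(0)) == H_PMF[((1, 0), (0, 0))]
print(loc_is_uniform, h_is_uniform)
```

Each nonzero input is mapped to the four possible outputs with probability 1/4 each, which is why the transition probability depends only on the two subspaces.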

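B. Proof of Theorem 5

Consider a subspace degradation of LOC(H, T) with $P_{X|\langle X\rangle}(X|U)$. First,

$$P_{Y|\langle X\rangle}(Y|U) = \sum_{X:\ \langle X\rangle = U} P(Y|X) P(X|U) = \sum_{X:\ \langle X\rangle = U} \mu(U, \langle Y\rangle) P(X|U) = \mu(U, \langle Y\rangle).$$

Then,

$$P_{\langle Y\rangle|\langle X\rangle}(V|U) = \sum_{Y:\ \langle Y\rangle = V} P_{Y|\langle X\rangle}(Y|U) = \mu(U,V)\, \chi^N_{\dim(V)}.$$

So all $P_{X|\langle X\rangle}(X|U)$ give the same subspace degradation.

Now we prove the rest of the theorem. Let $\mathcal{U} = \mathrm{Pj}(\mathbb{F}^T)$. We have

$$I(X;Y) = \sum_{V,U \in \mathcal{U}}\ \sum_{X,Y:\ \langle X\rangle = U,\ \langle Y\rangle = V} p_X(X) P_{Y|X}(Y|X) \log_2 \frac{P_{Y|X}(Y|X)}{p_Y(Y)} \ge \sum_{V,U \in \mathcal{U}} p_{\langle X\rangle\langle Y\rangle}(U,V) \log_2 \frac{p_{\langle X\rangle\langle Y\rangle}(U,V)}{p_{\langle X\rangle}(U)\, p_{\langle Y\rangle}(V)} \tag{10}$$

$$= I(\langle X\rangle;\langle Y\rangle),$$

where (10) follows from the log-sum inequality. To prove this theorem, we only need to show that equality in
(10) holds for uniform LOCs. We need to check that $P_{Y|X}(Y|X)/p_Y(Y)$ is a constant for all X and Y with
$\langle Y\rangle = V \le \langle X\rangle = U \le \mathbb{F}^T$. Fix an input distribution $p_X$. Since the LOC is uniform,

$$p_Y(Y) = \sum_{X:\ V \le \langle X\rangle} P_{Y|X}(Y|X)\, p_X(X) = \sum_{U' \le \mathbb{F}^T:\ V \le U'}\ \sum_{X:\ \langle X\rangle = U'} \mu(U',V)\, p_X(X) = \sum_{U' \le \mathbb{F}^T:\ V \le U'} \mu(U',V)\, p_{\langle X\rangle}(U').$$

Thus,

$$\frac{P_{Y|X}(Y|X)}{p_Y(Y)} = \frac{\mu(U,V)}{\sum_{U' \le \mathbb{F}^T:\ V \le U'} \mu(U',V)\, p_{\langle X\rangle}(U')}.$$

This verifies that equality holds in (10).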

VI. Mutual Information Decomposition 

For subspace degradations, α-type input distributions are also useful.

Theorem 7: There exists an α-type input that maximizes $I(\langle X\rangle; \langle Y\rangle)$ for any LOC, i.e.,

$$C_{\mathrm{SS}}(H,T) = \max_{p_X:\ \alpha\text{-type}} I(\langle X\rangle;\langle Y\rangle).$$

Proof: This theorem can be proved similarly to Theorem 4 by applying Lemma 2. ■

For a random matrix X, recall that rk(X) is the random variable representing the rank of X (see (4) for the
PMF). Similar to Lemma 3, I(rk(X); rk(Y)) is determined by $p_X$ and $P_{Y|X}$. Define

$$J(\mathrm{rk}(X);\mathrm{rk}(Y)) = \sum_{r,s} p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s) \log_2 \frac{\chi^T_s}{\chi^r_s} = E\left[\log_2 \frac{\chi^T_{\mathrm{rk}(Y)}}{\chi^{\mathrm{rk}(X)}_{\mathrm{rk}(Y)}}\right], \tag{11}$$

where $p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)$ can be derived using $p_X$ and $P_{Y|X}$.

Theorem 8: For a LOC with α-type inputs,

$$I(\langle X\rangle; \langle Y\rangle) = I(\mathrm{rk}(X);\mathrm{rk}(Y)) + J(\mathrm{rk}(X);\mathrm{rk}(Y)). \tag{12}$$

Proof: The proof is done by rewriting the formulation of mutual information using the symmetric property
and the definition of α-type inputs. See Section VI-B for details. ■
In (12), I(rk(X); rk(Y)) is the mutual information of the ranks of the transmitted and received matrices. In other
words, it is the rate transmitted using the matrix ranks. The meaning of J(rk(X); rk(Y)) has an interpretation using
set packing. The capacity contributed by r-dimensional transmissions and s-dimensional receptions is
$\log_2 \frac{\chi^T_s}{\chi^r_s} = \log_2 \binom{T}{s}_q \big/ \binom{r}{s}_q$, where $\binom{T}{s}_q$ is the total number of s-dimensional subspaces in $\mathbb{F}^T$, and $\binom{r}{s}_q$ is the total number of
s-dimensional subspaces in an r-dimensional subspace. Treat an s-dimensional subspace in $\mathbb{F}^T$ as a set element.
An r-dimensional transmission can be regarded as a collection of s-dimensional subspaces that span it. Then, the
maximum set packing problem is looking for the maximum number of pairwise disjoint collections of s-dimensional
subspaces which have cardinality $\binom{r}{s}_q$ and span an r-dimensional subspace.
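As a small worked instance of this packing interpretation (the parameters q = 2, T = 4, r = 2, s = 1 are chosen here purely for illustration, and $\chi^m_r$ is assumed to equal $(q^m-1)(q^m-q)\cdots(q^m-q^{r-1})$):

```latex
\binom{4}{1}_2 = \frac{2^4-1}{2-1} = 15, \qquad
\binom{2}{1}_2 = \frac{2^2-1}{2-1} = 3, \qquad
\log_2\frac{\chi^4_1}{\chi^2_1} = \log_2\frac{15}{3} = \log_2 5 \approx 2.32 \text{ bits}.
```

In this case the bound is met with equality: the 15 one-dimensional subspaces of $\mathbb{F}_2^4$ can be partitioned into 5 disjoint collections of 3, each collection being the set of lines in one two-dimensional subspace.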

One simple coding scheme for LOCs is to use part of X to recover the instance of H at the receiver. Such a
scheme is referred to as channel training and can only achieve rate (1 − M/T) E[rk(H)] (see the analysis in [13]).
Theorem 8 implies that subspace coding can achieve a rate strictly higher than channel training.

Corollary 9: For LOC(H, T) where H has dimension M × N and T ≥ M,

$$C_{\mathrm{SS}}(H,T) \ge E\left[\log_2 \frac{\chi^T_{\mathrm{rk}(H)}}{\chi^M_{\mathrm{rk}(H)}}\right] = (T - M)\, E[\mathrm{rk}(H)] \log_2 q + \epsilon(T, q), \tag{13}$$

where $0 \le \epsilon(T, q) = \sum_s p_{\mathrm{rk}(H)}(s) \log_2 \frac{\zeta^T_s}{\zeta^M_s} < 1.8$. This lower bound is achieved by the α-type input $p_X$ with
$p_{\mathrm{rk}(X)}(M) = 1$.

Proof: See Section VI-B. ■

Remark: Note that this bound depends on the rank distribution of the transfer matrix. This lower bound is
in fact tight for certain LOCs with sufficiently large T (see Theorem 13).
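The bound in (13) only needs the rank distribution of H, so it is cheap to evaluate. The sketch below assumes that $\chi^m_s = (q^m-1)(q^m-q)\cdots(q^m-q^{s-1})$ and uses a made-up rank PMF; the function name `subspace_rate_bound` and all parameter values are hypothetical.

```python
from math import log2

def chi(m, r, q):
    """(q^m - 1)(q^m - q)...(q^m - q^(r-1)); assumed meaning of the paper's symbol."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def subspace_rate_bound(T, M, q, rank_pmf):
    """Return (bound, main_term, eps) for the lower bound (13), in bits per channel use.

    rank_pmf maps each rank s of H to its probability (illustrative input).
    """
    bound = sum(p * (log2(chi(T, s, q)) - log2(chi(M, s, q)))
                for s, p in rank_pmf.items())
    mean_rank = sum(s * p for s, p in rank_pmf.items())
    main = (T - M) * mean_rank * log2(q)
    return bound, main, bound - main

# Made-up example: H is 4 x N over GF(2) with rank 0, 2 or 4.
bound, main, eps = subspace_rate_bound(T=8, M=4, q=2, rank_pmf={0: 0.1, 2: 0.3, 4: 0.6})
print(round(bound, 3), round(main, 3), round(eps, 3))
```

The split into `main` and `eps` mirrors the right-hand side of (13): the first term grows linearly in T − M, while the second stays below the constant 1.8.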

A. A Useful Form 

Lemma 4: Let X be an input matrix of LOC(H, T). Then,

$$P_{\mathrm{rk}(Y)|X}(s|X) = P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|\langle X^\top\rangle) = \Pr\{\mathrm{rk}(DH) = s\},$$

where D is any rk(X) × M matrix with $\langle D^\top\rangle = \langle X^\top\rangle$.

Proof: See Section VI-B. ■

We have a refined version of Lemma 1.

Lemma 5: A function $p : \mathbb{F}^{T\times M} \to \mathbb{R}$ is an α-type PMF if and only if it can be written as

$$p(X) = R(\mathrm{rk}(X))\, \frac{Q_{\mathrm{rk}(X)}(\langle X^\top\rangle)}{\chi^T_{\mathrm{rk}(X)}}, \tag{14}$$

where $Q_r(\cdot)$ is a PMF over $\mathrm{Gr}(r, \mathbb{F}^M)$ and $R(\cdot)$ is a PMF over $\{0, 1, \ldots, M\}$.

Proof: If p can be written as (14), then by Lemma 1, p is an α-type PMF. On the other hand, if p is an α-type
PMF, it can be written as (7). Let

$$R(r) = \sum_{U:\ \dim(U) = r} Q(U).$$

For r such that R(r) > 0, let

$$Q_r(U) = \begin{cases} Q(U)/R(r), & \dim(U) = r \\ 0, & \text{o.w.} \end{cases}$$

For r such that R(r) = 0, let $Q_r(\cdot)$ be any PMF over $\mathrm{Gr}(r, \mathbb{F}^M)$. Since $Q_{\dim(U)}(U)\, R(\dim(U)) = Q(U)$, we see
that p can be written as (14). ■
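When using the formulation in (14), I(rk(X); rk(Y)) and J(rk(X); rk(Y)) can be written as functions of $Q_r(U)$
and R(r) as follows. Using the Markov chain property,

$$P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) = \sum_{U \in \mathrm{Gr}(r, \mathbb{F}^M)} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)\, P_{\langle X^\top\rangle|\mathrm{rk}(X)}(U|r) = \sum_{U \in \mathrm{Gr}(r, \mathbb{F}^M)} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)\, Q_r(U), \tag{15}$$

in which $P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)$, given in Lemma 4, is a function of $p_H$ and is not related to $Q_r(U)$ and R(r). Thus,
we can write

$$I(\mathrm{rk}(X);\mathrm{rk}(Y)) = \sum_{r} R(r) \sum_{s} P(s|r) \log_2 \frac{P(s|r)}{\sum_{r'} R(r') P(s|r')}, \tag{16}$$

where $P(s|r) = P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r)$ is given in (15). On the other hand,

$$J(\mathrm{rk}(X);\mathrm{rk}(Y)) = \sum_{r} R(r) \sum_{U \in \mathrm{Gr}(r, \mathbb{F}^M)} Q_r(U)\, g(U),$$

where

$$g(U) = \sum_{s} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) \log_2 \frac{\chi^T_s}{\chi^{\dim(U)}_s}. \tag{17}$$

Note that g(U) depends only on the distribution of H and not on the input.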
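B. Proofs

Proof of Theorem 8: Fix an α-type input $p_X$. For $V \le U \le \mathbb{F}^T$ with dim(U) = r and dim(V) = s, we first
show

$$p_{\langle X\rangle\langle Y\rangle}(U, V) = \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)}{\binom{T}{r}_q \binom{r}{s}_q}. \tag{18}$$

We only need to show that $p_{\langle X\rangle\langle Y\rangle}(U,V) = p_{\langle X\rangle\langle Y\rangle}(U', V')$ for any $V \le U \le \mathbb{F}^T$ and $V' \le U' \le \mathbb{F}^T$ with
dim(U) = dim(U') and dim(V') = dim(V), because if this is true,

$$p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s) = \sum_{\dim(U^*) = r,\ \dim(V^*) = s,\ V^* \subseteq U^*} p_{\langle X\rangle\langle Y\rangle}(U^*, V^*) = p_{\langle X\rangle\langle Y\rangle}(U,V) \sum_{\dim(U^*) = r,\ \dim(V^*) = s,\ V^* \subseteq U^*} 1 = \binom{T}{r}_q \binom{r}{s}_q\, p_{\langle X\rangle\langle Y\rangle}(U,V).$$

Let

$$A(m, U) = \{X \in \mathbb{F}^{T\times m} : \langle X\rangle = U\}.$$

There exists $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $\Phi U = U'$ and $\Phi V = V'$ (see Lemma 15 in the appendix). Then,

$$p_{\langle X\rangle\langle Y\rangle}(U,V) = \sum_{X \in A(M,U)} p_X(X) \sum_{Y \in A(N,V)} P_{Y|X}(Y|X)$$

$$= \sum_{X \in A(M,U)} p_X(\Phi X) \sum_{Y \in A(N,V)} P_{Y|X}(\Phi Y|\Phi X) \tag{19}$$

$$= \sum_{X \in A(M,\Phi U)} p_X(X) \sum_{Y \in A(N,\Phi V)} P_{Y|X}(Y|X) \tag{20}$$

$$= p_{\langle X\rangle\langle Y\rangle}(\Phi U, \Phi V) = p_{\langle X\rangle\langle Y\rangle}(U', V'),$$

where (19) follows because $p_X$ is α-type ($p_X(X) = p_X(\Phi X)$) and $P_{Y|X}(\Phi Y|\Phi X) = P_{Y|X}(Y|X)$ by
Theorem 1, and (20) follows from $A(m, \Phi U) = \Phi A(m, U)$ (see Lemma 17). This proves (18).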
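Applying the property of α-type inputs,

$$p_{\langle X\rangle}(U) = \sum_{X \in A(M,U)} p_X(X) = \sum_{X \in A(M,U)} p_X(\Phi X) = \sum_{X \in \Phi A(M,U)} p_X(X) = \sum_{X \in A(M,U')} p_X(X) \tag{21}$$

$$= p_{\langle X\rangle}(U'),$$

where (21) follows from Lemma 17. Therefore,

$$p_{\langle X\rangle}(U) = \frac{p_{\mathrm{rk}(X)}(r)}{\binom{T}{r}_q}. \tag{22}$$

Moreover,

$$p_{\langle Y\rangle}(V) = \sum_{U:\ V \subseteq U} p_{\langle X\rangle\langle Y\rangle}(U,V) = \sum_{r \ge s}\ \sum_{U:\ V \subseteq U,\ \dim(U) = r} p_{\langle X\rangle\langle Y\rangle}(U,V) = \sum_{r \ge s} \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)}{\binom{T}{r}_q \binom{r}{s}_q} \sum_{U:\ V \subseteq U,\ \dim(U) = r} 1$$

$$= \sum_{r \ge s} \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)}{\binom{T}{r}_q \binom{r}{s}_q} \binom{T-s}{r-s}_q \tag{23}$$

$$= \sum_{r \ge s} \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)}{\binom{T}{s}_q} \tag{24}$$

$$= \frac{p_{\mathrm{rk}(Y)}(s)}{\binom{T}{s}_q}, \tag{25}$$

where (23) and (24) follow from Lemma 12 in Appendix A. Substituting (18), (22) and (25) into $I(\langle X\rangle; \langle Y\rangle)$, we
have

$$I(\langle X\rangle;\langle Y\rangle) = \sum_{V \le U} p_{\langle X\rangle\langle Y\rangle}(U,V) \log_2 \frac{p_{\langle X\rangle\langle Y\rangle}(U,V)}{p_{\langle X\rangle}(U)\, p_{\langle Y\rangle}(V)}$$

$$= \sum_{s \le r}\ \sum_{V \le U:\ \dim(U) = r,\ \dim(V) = s} \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)}{\binom{T}{r}_q \binom{r}{s}_q} \log_2 \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)\, \binom{T}{s}_q}{p_{\mathrm{rk}(X)}(r)\, p_{\mathrm{rk}(Y)}(s)\, \binom{r}{s}_q}$$

$$= \sum_{s \le r} p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s) \log_2 \frac{p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s)\, \binom{T}{s}_q}{p_{\mathrm{rk}(X)}(r)\, p_{\mathrm{rk}(Y)}(s)\, \binom{r}{s}_q}$$

$$= I(\mathrm{rk}(X);\mathrm{rk}(Y)) + \sum_{s \le r} p_{\mathrm{rk}(X)\mathrm{rk}(Y)}(r,s) \log_2 \frac{\chi^T_s}{\chi^r_s}.$$

This completes the proof. ■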
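The decomposition in Theorem 8 can be checked numerically on a toy channel. The sketch below assumes q = 2 and T = M = N = 2, an arbitrary PMF on H, and an α-type input built by giving each row space an arbitrary weight and spreading it evenly over the matrices in that class. It also assumes $\chi^m_s = (2^m-1)(2^m-2)\cdots(2^m-2^{s-1})$. All names and numbers are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product
from math import log2

T = M = N = 2

def all_mats(r, c):
    """Every r x c matrix over GF(2), as a tuple of row tuples."""
    return [tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            for bits in product((0, 1), repeat=r * c)]

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) % 2
                       for col in zip(*B)) for row in A)

def span(vectors, n):
    """All GF(2) combinations of the given length-n vectors."""
    space = {tuple([0] * n)}
    for v in vectors:
        space |= {tuple(x ^ y for x, y in zip(u, v)) for u in space}
    return frozenset(space)

def dim(space):
    return len(space).bit_length() - 1      # a subspace has 2^dim elements

def chi(m, r):
    out = 1
    for i in range(r):
        out *= 2**m - 2**i
    return out

def mutual_info(joint):
    """Mutual information in bits of a joint PMF given as {(a, b): prob}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * log2(p / (pa[a] * pb[b])) for (a, b), p in joint.items() if p > 0)

Hs, Xs = all_mats(M, N), all_mats(T, M)
p_H = {H: (i + 1) / 136 for i, H in enumerate(Hs)}          # arbitrary PMF (1 + ... + 16 = 136)
row = {X: span(X, M) for X in Xs}                            # row space of each input
classes = sorted(set(row.values()), key=sorted)              # the 5 subspaces of GF(2)^2
weight = {U: (i + 1) / 15 for i, U in enumerate(classes)}    # arbitrary PMF over row spaces
count = Counter(row.values())
p_X = {X: weight[row[X]] / count[row[X]] for X in Xs}        # alpha-type by construction

joint_UV, joint_rs = defaultdict(float), defaultdict(float)
for X in Xs:
    U = span(zip(*X), T)                                     # column space of X
    for H in Hs:
        V = span(zip(*matmul2(X, H)), T)                     # column space of Y = XH
        joint_UV[(U, V)] += p_X[X] * p_H[H]
        joint_rs[(dim(U), dim(V))] += p_X[X] * p_H[H]

I_UV = mutual_info(joint_UV)
I_rs = mutual_info(joint_rs)
J = sum(p * log2(chi(T, s) / chi(r, s)) for (r, s), p in joint_rs.items())
print(I_UV, I_rs + J)
```

The two printed values should agree up to floating-point error. Replacing `p_X` by a PMF that is not α-type is a simple way to see that the identity is not expected to hold in general.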

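Proof of the Corollary: Substituting the $\alpha$-type input with $p_{\mathrm{rk}(X)}(M) = 1$ in Theorem 8, we have $I(\mathrm{rk}(X);\mathrm{rk}(Y)) = 0$
and $J(\mathrm{rk}(X);\mathrm{rk}(Y)) = \sum_s P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M) \log_2 \frac{\binom{T}{s}_q}{\binom{M}{s}_q}$. Given $X \in \mathbb{F}^{T\times M}$ with rank $M$,

\[ P_{\mathrm{rk}(Y)|X}(s|X) = \Pr\{\mathrm{rk}(XH) = s\} = \Pr\{\mathrm{rk}(H) = s\}. \]

Thus, $P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M) = \Pr\{\mathrm{rk}(H) = s\}$. Using the definition in (34), we can write

\[ \log_2 \frac{\binom{T}{s}_q}{\binom{M}{s}_q} = \log_2 \frac{\chi_s^T}{\chi_s^M} = (T-M)\,s\log_2 q + \log_2 \frac{\zeta_s^T}{\zeta_s^M}. \]

Since $\zeta_s^T \le 1$,

\[ \log_2 \frac{\zeta_s^T}{\zeta_s^M} \le \log_2 \frac{1}{\zeta_s^M} < 1.8, \qquad (26) \]

where the last inequality follows from Lemma 13 in Appendix B. So

\[
\begin{aligned}
J(\mathrm{rk}(X);\mathrm{rk}(Y))
&= \sum_s p_{\mathrm{rk}(H)}(s)\,(T-M)\,s\log_2 q + \sum_s p_{\mathrm{rk}(H)}(s) \log_2 \frac{\zeta_s^T}{\zeta_s^M} \\
&= (T-M)\log_2 q\ \mathrm{E}[\mathrm{rk}(H)] + \epsilon(T,q),
\end{aligned}
\]

where $\epsilon(T,q) = \sum_s p_{\mathrm{rk}(H)}(s) \log_2 \frac{\zeta_s^T}{\zeta_s^M} < 1.8$. The proof is completed by $C_{\mathrm{SS}}(H,T) \ge J(\mathrm{rk}(X);\mathrm{rk}(Y))$. ■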
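
The decomposition used in this proof can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper names are illustrative, and the rank distribution of a purely random $M\times N$ matrix is used only as an assumed example of $p_{\mathrm{rk}(H)}$. It compares $(T-M)\log_2 q\,\mathrm{E}[\mathrm{rk}(H)] + \epsilon(T,q)$ with the direct sum $\sum_s p_{\mathrm{rk}(H)}(s)\log_2 \binom{T}{s}_q / \binom{M}{s}_q$.

```python
from math import log2

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial: number of r-dimensional subspaces of F_q^m."""
    return chi(m, r, q) // chi(r, r, q)

def chi_mn(m, n, r, q):
    """Number of m x n matrices of rank r over F_q."""
    return chi(m, r, q) * chi(n, r, q) // chi(r, r, q)

def zeta(m, r, q):
    """zeta_r^m = chi_r^m * q^(-m r), computed as a product of floats."""
    out = 1.0
    for i in range(r):
        out *= 1.0 - float(q) ** (i - m)
    return out

def corollary_terms(p_rank, T, M, q):
    """Return ((T - M) log2(q) E[rk(H)] + eps, eps) for a rank distribution."""
    mean_rank = sum(s * p for s, p in enumerate(p_rank))
    eps = sum(p * log2(zeta(T, s, q) / zeta(M, s, q))
              for s, p in enumerate(p_rank) if p > 0)
    return (T - M) * log2(q) * mean_rank + eps, eps

def j_direct(p_rank, T, M, q):
    """Direct evaluation of sum_s p(s) log2( [T s]_q / [M s]_q )."""
    return sum(p * (log2(gauss_binom(T, s, q)) - log2(gauss_binom(M, s, q)))
               for s, p in enumerate(p_rank) if p > 0)

# Assumed example: H uniform over all M x N matrices over F_2.
q, M, N, T = 2, 3, 4, 8
p_rank = [chi_mn(M, N, k, q) / q ** (M * N) for k in range(min(M, N) + 1)]
bound, eps = corollary_terms(p_rank, T, M, q)
print(bound, j_direct(p_rank, T, M, q), eps)
```

The two evaluations should agree up to floating-point error, and the printed $\epsilon$ should lie in $[0, 1.8)$.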
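Proof of the Lemma: Fix a $\mathrm{rk}(X) \times M$ matrix $D$ with $\langle X^\top\rangle = \langle D^\top\rangle$. Let $B^\top = X^\top / D^\top$. We know $B$ has
full column rank. Since $X \to Y \to \mathrm{rk}(Y)$ forms a Markov chain,

\[
\begin{aligned}
P_{\mathrm{rk}(Y)|X}(s|X)
&= \sum_{Y} P_{\mathrm{rk}(Y)|Y}(s|Y)\, P_{Y|X}(Y|X) \\
&= \sum_{Y:\ \mathrm{rk}(Y)=s} P_{Y|X}(Y|X) \\
&= \sum_{Y:\ \mathrm{rk}(Y)=s} \Pr\{DH = Y/B\} && (27) \\
&= \sum_{E:\ \mathrm{rk}(E)=s} \Pr\{DH = E\} \\
&= \Pr\{\mathrm{rk}(DH) = s\},
\end{aligned}
\]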

where (l27l i follows from (0. 

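Let $U = \langle X^\top\rangle$. By the Markov chain $\langle X^\top\rangle \to X \to \mathrm{rk}(Y)$,

\[
\begin{aligned}
P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)
&= \sum_{X':\ \langle X'^\top\rangle = U} P_{\mathrm{rk}(Y)|X}(s|X')\, P_{X|\langle X^\top\rangle}(X'|U) \\
&= \Pr\{\mathrm{rk}(DH) = s\} \sum_{X':\ \langle X'^\top\rangle = U} P_{X|\langle X^\top\rangle}(X'|U) \\
&= \Pr\{\mathrm{rk}(DH) = s\}.
\end{aligned}
\]

The proof is completed. ■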
VII. Constant-Rank Inputs for Subspace Degradations

An input distribution with $p_{\mathrm{rk}(X)}(r) = 1$ is called a constant-rank or rank-$r$ input distribution. Note that for a
subspace degradation, using a rank-$r$ input corresponds to using $r$-dimensional subspace coding. Let

\[ C_{\text{C-SS}}(H,T) = \max_{p_X:\ \text{constant-rank}} I(\langle X\rangle;\langle Y\rangle), \]

i.e., $C_{\text{C-SS}}(H,T)$ is the maximum achievable rate of constant-dimensional subspace coding. The rank of a constant-rank
input that achieves $C_{\text{C-SS}}(H,T)$ is called the optimal input rank.

We will prove the following theorem in Section VII-C in a way similar to the proof of Theorem 4.
Theorem 10: There exists a constant-rank $\alpha$-type input that achieves $C_{\text{C-SS}}(H,T)$ for any LOC.
Theorem 11: For LOC$(H,T)$ where $H$ has dimension $M \times N$, let

\[ U^* = \arg\max_{U \in \mathrm{Pj}(\min\{M,T\},\,\mathbb{F}^M)} g(U), \]

where $g(U)$ is defined in (17). Then, $r^* = \dim(U^*)$ is an optimal input rank and $C_{\text{C-SS}}(H,T) = g(U^*)$. Furthermore,

\[ C_{\mathrm{SS}}(H,T) - C_{\text{C-SS}}(H,T) \le \max_{p_X} I(\mathrm{rk}(X);\mathrm{rk}(Y)) \le \log_2 \min\{M,N,T\}. \]

Proof: See Section VII-C. ■

A. Optimal Input Rank 
For LOC$(H,T)$, define

\[ \mathrm{rk}^*(H) = \max\{r : \Pr\{\mathrm{rk}(H) = r\} > 0\}. \]

Lemma 6: Consider LOC$(H,T)$ where $H$ has dimension $M \times N$ and $T \ge M$. Fix an $\alpha$-type input. For $V \le \mathbb{F}^M$
with $\dim(V) = r < \mathrm{rk}^*(H)$,

\[ g(\mathbb{F}^M) - g(V) \ge \Theta(T,r,H)\log_2 q, \]

where

\[ \Theta(T,r,H) = (T-M)(\mathrm{rk}^*(H) - r)\,p_{\mathrm{rk}(H)}(\mathrm{rk}^*(H)) - r(M-r) + \log_q \zeta_r^r. \]

Proof: See Section VII-C. ■
Theorem 12: For LOC$(H,T)$, there exists $T_0$ such that when $T \ge T_0$, $r^* \ge \mathrm{rk}^*(H)$, where $r^*$ is the optimal
input rank given in Theorem 11.

Proof of Theorem 12: Suppose the dimension of $H$ is $M \times N$. Fix $T_0$ such that $\Theta(T_0,r,H) > 0$ for all
$r < \mathrm{rk}^*(H)$. This is possible because $\Theta(T,r,H)$ is a linearly increasing function of $T$ for all $r < \mathrm{rk}^*(H)$. Assume
$T \ge T_0$ and $r^* < \mathrm{rk}^*(H)$. For any $V \le \mathbb{F}^M$ with $\dim(V) < \mathrm{rk}^*(H) \le M$, by Lemma 6, $g(\mathbb{F}^M) > g(V)$. Thus,
we have a contradiction to $r^* < \mathrm{rk}^*(H)$. ■
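Theorem 12 narrows down the range to search for an optimal input rank when $T$ is large. The proof shows that one valid choice of $T_0$ is

\[ T_0 = \min\{T : \Theta(T,r,H) > 0 \text{ for all } r < \mathrm{rk}^*(H)\}. \]

When $\mathrm{rk}^*(H) = M$, the condition $\Theta(T,r,H) > 0$ for all $r < M$ is equivalent to

\[ T > M + \frac{M - 1 - \log_q \zeta_{M-1}^{M-1}}{p_{\mathrm{rk}(H)}(M)}, \]

since the constraint is tightest at $r = M-1$.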
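
The threshold $T_0$ can be found by direct search. The sketch below is illustrative only: it assumes the form of $\Theta(T,r,H)$ stated in Lemma 6, takes the rank distribution of $H$ as a plain list `p_rank` (a hypothetical input format), and scans $T \ge M$ until $\Theta(T,r,H) > 0$ for every $r < \mathrm{rk}^*(H)$.

```python
from math import log

def zeta(m, r, q):
    """zeta_r^m = prod_{i=0}^{r-1} (1 - q^(i-m)); equals 1 when r = 0."""
    out = 1.0
    for i in range(r):
        out *= 1.0 - float(q) ** (i - m)
    return out

def rk_star(p_rank):
    """Largest rank that H takes with positive probability."""
    return max(k for k, p in enumerate(p_rank) if p > 0)

def theta(T, r, M, q, p_rank):
    """Theta(T, r, H) as stated in Lemma 6."""
    k = rk_star(p_rank)
    return ((T - M) * (k - r) * p_rank[k]
            - r * (M - r)
            + log(zeta(r, r, q), q))

def find_t0(M, q, p_rank, t_max=10_000):
    """Smallest T >= M with Theta(T, r, H) > 0 for all r < rk*(H)."""
    k = rk_star(p_rank)
    for T in range(M, t_max + 1):
        if all(theta(T, r, M, q, p_rank) > 0 for r in range(k)):
            return T
    raise ValueError("no T_0 found up to t_max")

# Assumed example: M = 2 over F_2 with Pr{rk(H) = 0, 1, 2} = 0.1, 0.3, 0.6.
t0 = find_t0(2, 2, [0.1, 0.3, 0.6])
print(t0)
```

For this example the binding constraint is $r = 1$, where $\Theta(T,1,H) = 0.6(T-2) - 2$, so the search stops at the first integer $T$ exceeding $2 + 2/0.6$.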



B. Optimality of Using Constant-Rank Inputs 

For subspace degradations, we have shown that the loss of rate incurred by using only constant-dimensional subspace
coding is upper bounded. In fact, using constant-rank inputs is optimal for subspace degradations under the following
constraints.

Definition 5: A random matrix $H$ with dimension $M \times N$ is regular if $p_{\mathrm{rk}(H)}(s) > 0$ for $0 \le s \le M$. Furthermore,
LOC$(H,T)$ is regular if $H$ is regular.

Theorem 13: Consider regular LOC$(H,T)$ where $H$ has dimension $M \times N$. There exists $T_1$ such that when $T \ge T_1$,
$C_{\mathrm{SS}}$ is achieved by the $\alpha$-type input with $R(M) = 1$. In this case

\[ C_{\mathrm{SS}}(H,T) = g(\mathbb{F}^M) = \sum_s p_{\mathrm{rk}(H)}(s) \log_2 \frac{\binom{T}{s}_q}{\binom{M}{s}_q} = \mathrm{E}\!\left[\log_2 \frac{\binom{T}{\mathrm{rk}(H)}_q}{\binom{M}{\mathrm{rk}(H)}_q}\right]. \]

Proof: See Section VII-C. ■



C. Proofs 

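Proof of Theorem 10: Consider a LOC with block length $T$. Let $p_X(X)$ be an optimal constant-rank input
with $p_{\mathrm{rk}(X)}(r^*) = 1$. For $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$, define $p_X^{\Phi}$ as $p_X^{\Phi}(X) = p_X(\Phi X)$. It is clear that $p^{\Phi}_{\mathrm{rk}(X)}(r^*) = 1$. By
Lemma 12, $p_X^{\Phi}$ is also an optimal constant-rank input. Define $p_X^*$ as

\[ p_X^*(X) = \frac{1}{\chi_T^T} \sum_{\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})} p_X^{\Phi}(X). \]

By the concavity of the mutual information, $p_X^*$ is also an optimal constant-rank input. Similar to the procedure in
the proof of Theorem 4, we can check that $p_X^*$ is an $\alpha$-type distribution. ■
Proof of Theorem 11: For an $r$-dimensional $\alpha$-type input,

\[
\begin{aligned}
I(\langle X\rangle;\langle Y\rangle)
&= \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} Q_r(U)\, g(U) \\
&\le \max_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} g(U) \\
&\le g(U^*).
\end{aligned}
\]

Thus $C_{\text{C-SS}} \le g(U^*)$. On the other hand, for the $r^*$-dimensional $\alpha$-type input with $P_{\langle X^\top\rangle}(U^*) = 1$, $C_{\text{C-SS}} \ge
I(\langle X\rangle;\langle Y\rangle) = g(U^*)$.

Furthermore, for an $\alpha$-type input, $I(\langle X\rangle;\langle Y\rangle) - C_{\text{C-SS}} = I(\mathrm{rk}(X);\mathrm{rk}(Y)) + J(\mathrm{rk}(X);\mathrm{rk}(Y)) - g(U^*) \le I(\mathrm{rk}(X);\mathrm{rk}(Y))$.
Thus, $C_{\mathrm{SS}} - C_{\text{C-SS}} = \max_{p_X:\ \alpha\text{-type}} I(\langle X\rangle;\langle Y\rangle) - C_{\text{C-SS}} \le \max_{p_X} I(\mathrm{rk}(X);\mathrm{rk}(Y)) \le \log_2 \min\{M,N,T\}$. ■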
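Proof of Lemma 6: Let $U = \mathbb{F}^M$. Since $V \le U$, there exists a full rank $M \times M$ matrix

\[ D = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix} \]

such that $\langle D^\top\rangle = U$ and $\langle D_1^\top\rangle = V$. By the lemma that expresses $P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)$ as $\Pr\{\mathrm{rk}(DH) = s\}$,

\[ \sum_{s'\ge s} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s'|V) = \Pr\{\mathrm{rk}(D_1 H) \ge s\}, \]

and

\[ P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) = \Pr\{\mathrm{rk}(DH) = s\} = \Pr\{\mathrm{rk}(H) = s\}. \]

We know $\Pr\{\mathrm{rk}(H) \ge s\} \ge \Pr\{\mathrm{rk}(D_1 H) \ge s\}$. So

\[ \sum_{s'\ge s} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s'|U) \ge \sum_{s'\ge s} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s'|V). \]

Moreover, for $s$ such that $r < s \le \mathrm{rk}^*(H)$,

\[ \sum_{s'\ge s} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s'|V) = 0. \]

Thus,

\[
\begin{aligned}
\sum_s s\left(P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) - P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\right)
&= \sum_{k\ge 1}\ \sum_{s:\ s\ge k} \left(P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) - P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\right) \\
&\ge \sum_{k:\ \mathrm{rk}^*(H)\ge k>r}\ \sum_{s:\ s\ge k} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) \\
&\ge \sum_{k:\ \mathrm{rk}^*(H)\ge k>r} \Pr\{\mathrm{rk}(H) = \mathrm{rk}^*(H)\} \\
&= (\mathrm{rk}^*(H) - r)\Pr\{\mathrm{rk}(H) = \mathrm{rk}^*(H)\}. && (28)
\end{aligned}
\]

By definition,

\[
\begin{aligned}
\frac{g(U) - g(V)}{\log_2 q}
&= \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)\left((T-M)s + \log_q \frac{\zeta_s^T}{\zeta_s^M}\right)
 - \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\left((T-r)s + \log_q \frac{\zeta_s^T}{\zeta_s^r}\right) \\
&= (T-M)\sum_s s\left(P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) - P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\right)
 - (M-r)\sum_s s\,P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V) \\
&\quad + \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)\log_q \frac{\zeta_s^T}{\zeta_s^M}
 - \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\log_q \frac{\zeta_s^T}{\zeta_s^r} \\
&\ge (T-M)(\mathrm{rk}^*(H) - r)\Pr\{\mathrm{rk}(H) = \mathrm{rk}^*(H)\} - r(M-r) + \log_q \zeta_r^r,
\end{aligned}
\]

where the last inequality follows from (28), from $\zeta_s^T \ge \zeta_s^M$ for $T \ge M$ (so the third sum is nonnegative), from

\[ (M-r)\sum_s s\,P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V) \le r(M-r), \]

and from

\[ \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\log_q \frac{\zeta_s^T}{\zeta_s^r} \le \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|V)\log_q \frac{1}{\zeta_s^r} \le \log_q \frac{1}{\zeta_r^r}. \]

■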

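Proof of Theorem 13:

We treat $Q_r(U)$ and $R(r)$ as the variables to maximize $I(\langle X\rangle;\langle Y\rangle)$. By the KKT conditions, a set of necessary
and sufficient conditions for an $\alpha$-type input with variables $Q_r(U)$ and $R(r)$ to achieve $C_{\mathrm{SS}}(H,T)$ is that

\[
\begin{aligned}
\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial Q_r(U)} + R(r)\,g(U) &= \lambda_r
 && \forall r,\ U \in \mathrm{Gr}(r,\mathbb{F}^M) : Q_r(U) > 0, && (29a) \\
\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial Q_r(U)} + R(r)\,g(U) &\le \lambda_r
 && \forall r,\ U \in \mathrm{Gr}(r,\mathbb{F}^M) : Q_r(U) = 0, && (29b) \\
\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)} + \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} Q_r(U)\,g(U) &= \lambda
 && \forall r : R(r) > 0, && (29c) \\
\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)} + \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} Q_r(U)\,g(U) &\le \lambda
 && \forall r : R(r) = 0, && (29d)
\end{aligned}
\]

where the partial derivatives are

\[ \frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial Q_r(U)} = R(r) \sum_s P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) \log_2 \frac{P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r)}{P_{\mathrm{rk}(Y)}(s)} - \log_2 e, \]

and

\[ \frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)} = \sum_s P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) \log_2 \frac{P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r)}{P_{\mathrm{rk}(Y)}(s)} - \log_2 e. \]

We can check that

\[ C_{\mathrm{SS}}(H,T) = \lambda + \log_2 e \]

and

\[ \lambda = \sum_r \lambda_r + (M-1)\log_2 e. \]

To prove the theorem, we only need to check that the $\alpha$-type input with $R(M) = 1$ satisfies (29). Conditions
(29a) and (29b) with $r < M$ are satisfied by $\lambda_r = -\log_2 e$ because $R(r) = 0$. Since $Q_M(\mathbb{F}^M) = 1$, we check
condition (29a) with $r = M$. Since $P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M) = P_{\mathrm{rk}(Y)}(s)$,

\[ \left.\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial Q_M(\mathbb{F}^M)}\right|_{R(M)=1} = -\log_2 e. \]

So, (29a) with $r = M$ is satisfied by $\lambda_M = g(\mathbb{F}^M) - \log_2 e$. This completes the verification of (29a) and (29b).

The above analysis also tells that $\lambda = \lambda_M$. Now we check (29c) and (29d) with $\lambda = g(\mathbb{F}^M) - \log_2 e$. Since
$R(M) = 1$, condition (29c) should be satisfied with $r = M$. This is true since

\[ \left.\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(M)}\right|_{R(M)=1} = -\log_2 e = \lambda - g(\mathbb{F}^M). \]

Next, we check condition (29d) for $r < M$. We know

\[ \left.\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)}\right|_{R(M)=1} = \underbrace{\sum_s P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) \log_2 \frac{P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r)}{P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M)}}_{(A)} - \log_2 e. \]

Since

\[
\begin{aligned}
P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M) &= P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|\mathbb{F}^M) \\
&= \Pr\{\mathrm{rk}(DH) = s\} \\
&= \Pr\{\mathrm{rk}(H) = s\},
\end{aligned}
\]

we have

\[
\begin{aligned}
(A) &\le \sum_s P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) \log_2 \frac{1}{P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|M)} \\
&= \sum_s P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) \log_2 \frac{1}{p_{\mathrm{rk}(H)}(s)} \\
&\le -\log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s).
\end{aligned}
\]

That is

\[ \left.\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)}\right|_{R(M)=1} \le -\log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s) - \log_2 e. \]

Fix $T_1$ such that $\Theta(T_1,r,H) > -\log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s)$ for all $r < M$. This is possible because $\Theta(T,r,H)$
is linearly increasing with $T$ and $-\log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s)$ does not change with $T$. By Lemma 6, $g(\mathbb{F}^M) >
g(U) - \log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s)$ for all $U \in \mathrm{Gr}(r,\mathbb{F}^M)$. Thus

\[
\begin{aligned}
\lambda &= g(\mathbb{F}^M) - \log_2 e \\
&> \max_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} g(U) - \log_2 \min_{0\le s\le M} p_{\mathrm{rk}(H)}(s) - \log_2 e \\
&\ge \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} Q_r(U)\,g(U) + \left.\frac{\partial I(\mathrm{rk}(X);\mathrm{rk}(Y))}{\partial R(r)}\right|_{R(M)=1}.
\end{aligned}
\]

Hence, condition (29d) with $r < M$ is satisfied. ■
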

VIII. More about Uniform LOCs 

A. Alternative Definition 

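We have the following alternative definition of uniform LOCs, where the equivalence follows from the symmetric
property in Theorem 1 (see the proof in Lemma 7).

Definition 6 (Alternative definition of uniform LOCs): A LOC$(H,T)$ is uniform if there exists a function $\mu_{\mathrm{rk}} :
\mathbb{Z}_+ \times \mathbb{Z}_+ \to [0,1]$ such that

\[ \Pr\{Y = XH\} = \begin{cases} \mu_{\mathrm{rk}}(\mathrm{rk}(X),\mathrm{rk}(Y)) & \langle Y\rangle \subset \langle X\rangle, \\ 0 & \text{o.w.} \end{cases} \]

Lemma 7: Consider a uniform LOC$(H,T)$ and the function $\mu$ defined in Definition 3. For $V \le U \le \mathbb{F}^T$ and
$V' \le U' \le \mathbb{F}^T$ with $\dim(V') = \dim(V)$ and $\dim(U') = \dim(U)$, $\mu(U',V') = \mu(U,V)$.

Proof: Let $\dim(V) = s$ and $\dim(U) = r$. Find $T \times s$ matrices $B$ and $B'$ such that $\langle B\rangle = V$ and
$\langle B'\rangle = V'$. Furthermore, there exist $X = [B\ \ C]$ and $X' = [B'\ \ C']$ such that $\langle X\rangle = U$ and $\langle X'\rangle = U'$. There
exists $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $\Phi B = B'$ and $\Phi X = X'$ (cf. the proof of Lemma 16 in Appendix B). Thus,

\[
\begin{aligned}
P_{\langle Y\rangle|X}(V|X)
&= \sum_{Y:\ \langle Y\rangle = V} P_{Y|X}(Y|X) \\
&= \sum_{Y:\ \langle Y\rangle = V} P_{Y|X}(\Phi Y|\Phi X) && (30) \\
&= \sum_{Y':\ \langle Y'\rangle = \Phi V} P_{Y|X}(Y'|X') \\
&= P_{\langle Y\rangle|X}(V'|X'),
\end{aligned}
\]

where (30) follows from Theorem 1. On the other hand, since there are $\chi_s^N$ matrices $Y \in \mathbb{F}^{T\times N}$ with $\langle Y\rangle = V$ (Lemma 17),

\[
\begin{aligned}
P_{\langle Y\rangle|X}(V|X) &= \sum_{Y:\ \langle Y\rangle = V} P_{Y|X}(Y|X) \\
&= \sum_{Y:\ \langle Y\rangle = V} \mu(U,V) \\
&= \chi_s^N\,\mu(U,V),
\end{aligned}
\]

and similarly

\[ P_{\langle Y\rangle|X}(V'|X') = \chi_s^N\,\mu(U',V'). \]

Therefore, $\mu(U',V') = \mu(U,V)$. ■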

B. Simplification for Uniform LOCs 

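For uniform LOCs, the discussion in Sections VI and VII can be further simplified. For a uniform LOC, let
$X \in \mathbb{F}^{T\times M}$ with $\langle X^\top\rangle = U$. By the lemma that relates $P_{\mathrm{rk}(Y)|\langle X^\top\rangle}$ to $P_{\mathrm{rk}(Y)|X}$,

\[
\begin{aligned}
P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U) &= P_{\mathrm{rk}(Y)|X}(s|X) \\
&= \sum_{Y:\ \langle Y\rangle \subset \langle X\rangle,\ \mathrm{rk}(Y)=s} P_{Y|X}(Y|X) \\
&= \sum_{Y:\ \langle Y\rangle \subset \langle X\rangle,\ \mathrm{rk}(Y)=s} \mu_{\mathrm{rk}}(\dim(U),s) \\
&= \binom{\dim(U)}{s}_q \chi_s^N\, \mu_{\mathrm{rk}}(\dim(U),s) \\
&\triangleq \beta(\dim(U),s). && (31)
\end{aligned}
\]

Using (31), we can rewrite (13) as

\[
\begin{aligned}
P_{\mathrm{rk}(Y)|\mathrm{rk}(X)}(s|r) &= \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} P_{\mathrm{rk}(Y)|\langle X^\top\rangle}(s|U)\,Q_r(U) \\
&= \sum_{U \in \mathrm{Gr}(r,\mathbb{F}^M)} \beta(r,s)\,Q_r(U) \\
&= \beta(r,s).
\end{aligned}
\]

So $I(\mathrm{rk}(X);\mathrm{rk}(Y))$ is only a function of $R(r)$. Let

\[ g(r) = \sum_s \beta(r,s) \log_2 \frac{\binom{T}{s}_q}{\binom{r}{s}_q}. \]

Using (31), the function $g(U)$ defined in (17) satisfies $g(U) = g(\dim(U))$, and

\[ J(\mathrm{rk}(X);\mathrm{rk}(Y)) = \sum_r R(r)\,g(r). \]

Thus, both $I(\mathrm{rk}(X);\mathrm{rk}(Y))$ and $J(\mathrm{rk}(X);\mathrm{rk}(Y))$ are functions of $R(r)$ only. So to maximize $I(\langle X\rangle;\langle Y\rangle)$ of a
uniform LOC, we only need to consider the rank distribution of an $\alpha$-type input, and the distribution $Q_r$ can be
chosen arbitrarily.

Other results in Sections VI and VII can be simplified accordingly, and we do not repeat the procedure here.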



C. Proof of Theorem [6] 

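Proof of i). Fix $X \in \mathbb{F}^{T\times M}$ with $\mathrm{rk}(X) = M$. The existence of such $X$ follows from $T \ge M$. Thus, for any
$Y \in \mathbb{F}^{T\times N}$ with $\langle Y\rangle \subset \langle X\rangle$, there is a unique $H$ such that $Y = XH$. So

\[
\begin{aligned}
p_H(H) &= \Pr\{Y = XH\} \\
&= \mu_{\mathrm{rk}}(\mathrm{rk}(X),\mathrm{rk}(Y)) \\
&= \mu_{\mathrm{rk}}(M,\mathrm{rk}(H)),
\end{aligned}
\]

where $\mu_{\mathrm{rk}}$ is given by the alternative definition of uniform LOCs. Therefore $H$ is uniform.

Proof of ii). Let $X \in \mathbb{F}^{T\times M}$ and $Y \in \mathbb{F}^{T\times N}$ with $\mathrm{rk}(X) = r$, $\mathrm{rk}(Y) = s$ and $\langle Y\rangle \subset \langle X\rangle$. Fix a full rank
decomposition $X = BD$ and write $Y = BE$. By the earlier proposition, we have

\[ P_{Y|X}(Y|X) = \Pr\{DH = E\}. \]

Let $A = \{H \in \mathbb{F}^{M\times N} : DH = E\}$. For any $\Psi \subset \mathbb{F}^{M\times N}$, let

\[ N_{\Psi}(k) = |\{H \in \Psi : \mathrm{rk}(H) = k\}|. \]

Since $H$ is uniform, we have

\[ P_{Y|X}(Y|X) = \sum_k p_{\mathrm{rk}(H)}(k)\,\frac{N_A(k)}{\chi_k^{M,N}}. \qquad (32) \]

To finish the proof, we need to determine $N_A(k)$.

Without loss of generality, we assume that the first $r$ columns of $D$ are linearly independent. Write $D = [D_1\ \ D_2]$.
We have $DH_0 = E$, where

\[ H_0 = \begin{bmatrix} D_1^{-1}E \\ 0 \end{bmatrix}. \]

We know $\mathrm{rk}(H_0) = \mathrm{rk}(E) = s$. Find $\Phi \in \mathrm{Fr}(\mathbb{F}^{N\times N})$ such that

\[ H_0\Phi = \begin{bmatrix} \tilde{H}_0 & 0 \end{bmatrix}, \qquad \tilde{H}_0 = \begin{bmatrix} H_{00} \\ 0 \end{bmatrix}, \]

where $H_{00}$ is an $r \times s$ full rank matrix and $\tilde{H}_0$ consists of the first $s$ columns of $H_0\Phi$. Since $\mathrm{rk}(D) = r$, the null space of
$D$, defined as

\[ \mathrm{Null}(D) = \{h \in \mathbb{F}^{M\times 1} : Dh = 0\}, \]

has dimension $M - r$. Let $g_i$, $i = 1, 2, \ldots, M-r$, be a basis of the null space of $D$, and let $G$ be the matrix
formed by juxtaposing the vectors in the basis. Thus,

\[ A = H_0 + G\,\mathbb{F}^{(M-r)\times N}. \]

Since $\mathrm{rk}(H) = \mathrm{rk}(H\Phi)$ for all $H \in \mathbb{F}^{M\times N}$,

\[ N_A(k) = N_{A\Phi}(k), \]

where

\[ A\Phi = H_0\Phi + G\,\mathbb{F}^{(M-r)\times N}. \]

Now we study the rank of $H = H_0\Phi + GF$ for $F \in \mathbb{F}^{(M-r)\times N}$. We know

\[ H = \begin{bmatrix} \tilde{H}_0 + GF_1 & GF_2 \end{bmatrix}, \]

where $F = [F_1\ \ F_2]$, $F_1 \in \mathbb{F}^{(M-r)\times s}$ and $F_2 \in \mathbb{F}^{(M-r)\times (N-s)}$. We show $\langle \tilde{H}_0\rangle \cap \langle G\rangle = \{0\}$. If there exists a nonzero
$h \in \langle \tilde{H}_0\rangle \cap \langle G\rangle$, by the structure of $\tilde{H}_0$, we have

\[ h = \begin{bmatrix} h_1 \\ 0 \end{bmatrix}. \]

Since $h$ is in the null space of $D$, $0 = Dh = D_1 h_1$. Because $D_1$ is full rank, we have $h_1 = 0$ and hence $h = 0$. This is a contradiction
to $h \ne 0$. Therefore, $\langle \tilde{H}_0\rangle \cap \langle G\rangle = \{0\}$. Hence, $\langle \tilde{H}_0 + GF_1\rangle \cap \langle G\rangle = \{0\}$, which implies $\mathrm{rk}(\tilde{H}_0 + GF_1) = s$. Thus,

\[
\begin{aligned}
\mathrm{rk}(H) &= \mathrm{rk}(\tilde{H}_0 + GF_1) + \mathrm{rk}(GF_2) \\
&= s + \mathrm{rk}(F_2).
\end{aligned}
\]

This gives that

\[ N_{A\Phi}(k) = |\mathbb{F}^{(M-r)\times s}|\; N_{\mathbb{F}^{(M-r)\times (N-s)}}(k-s) = q^{(M-r)s}\,\chi_{k-s}^{(M-r),(N-s)}. \]

Taking $N_A(k) = \chi_{k-s}^{(M-r),(N-s)}\,q^{(M-r)s}$ into (32), we have

\[ P_{Y|X}(Y|X) = \sum_{k\ge s} p_{\mathrm{rk}(H)}(k)\,\frac{\chi_{k-s}^{(M-r),(N-s)}\,q^{(M-r)s}}{\chi_k^{M,N}}. \]

Let

\[ \mu_{\mathrm{rk}}(r,s) = \sum_{k\ge s} p_{\mathrm{rk}(H)}(k)\,\frac{\chi_{k-s}^{(M-r),(N-s)}\,q^{(M-r)s}}{\chi_k^{M,N}}. \]

Therefore, $(H,T)$ is uniform. The proof is complete.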
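
The expression for $\mu_{\mathrm{rk}}(r,s)$ can be sanity-checked with exact arithmetic. The sketch below is illustrative and not part of the proof: the function names are hypothetical, and two rank distributions are assumed as examples. It verifies that the transition probabilities sum to one over all $Y$ with $\langle Y\rangle \subset \langle X\rangle$, i.e. $\sum_s \binom{r}{s}_q \chi_s^N \mu_{\mathrm{rk}}(r,s) = 1$, and that for a purely random $H$ the formula reduces to $q^{-rN}$.

```python
from fractions import Fraction

def chi(m, r, q):
    """chi_r^m = (q^m - 1)...(q^m - q^(r-1)); 1 if r = 0, 0 if r > m."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial: number of r-dimensional subspaces of F_q^m."""
    return chi(m, r, q) // chi(r, r, q)

def chi_mn(m, n, r, q):
    """Number of m x n matrices of rank r over F_q."""
    return chi(m, r, q) * chi(n, r, q) // chi(r, r, q)

def mu_rk(r, s, M, N, q, p_rank):
    """mu_rk(r, s) for a uniform H with rank distribution p_rank (Fractions)."""
    total = Fraction(0)
    for k in range(s, min(M, N) + 1):
        total += (p_rank[k] * chi_mn(M - r, N - s, k - s, q)
                  * q ** ((M - r) * s) / chi_mn(M, N, k, q))
    return total

def total_mass(r, M, N, q, p_rank):
    """Sum of P(Y|X) over all Y whose column space lies in that of X."""
    return sum(gauss_binom(r, s, q) * chi(N, s, q) * mu_rk(r, s, M, N, q, p_rank)
               for s in range(min(r, N) + 1))

q, M, N = 2, 3, 2
# Assumed example 1: H uniform over all M x N matrices.
p_random = [Fraction(chi_mn(M, N, k, q), q ** (M * N)) for k in range(min(M, N) + 1)]
# Assumed example 2: an arbitrary rank distribution, uniform within each rank.
p_skewed = [Fraction(0), Fraction(1, 4), Fraction(3, 4)]
print([total_mass(r, M, N, q, p_skewed) for r in range(M + 1)])
```

Both distributions should give total mass exactly 1 for every input rank $r$.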

IX. Concluding Remarks 

Linear operator channels with arbitrarily distributed transfer matrices are studied. One important guideline
obtained here is that constant-dimensional subspace coding suffices when subspace coding is used for
LOCs. We give a method to find the subspaces for constant-dimensional subspace coding. When the packet length
is short, encoding/decoding techniques for LOCs still need further investigation.
Appendix A
Counting

Parts of the counting problems here can be found in various sources, e.g., [14] and references therein. Here we
give self-contained proofs.

The projective space $\mathrm{Pj}(\mathbb{F}^t)$ is the collection of all subspaces of $\mathbb{F}^t$. Let $\mathrm{Pj}(m,\mathbb{F}^t)$ be the subset of $\mathrm{Pj}(\mathbb{F}^t)$ that
contains all the subspaces with dimension less than or equal to $m$. Let $\mathrm{Fr}(\mathbb{F}^{m\times r})$ be the set of full rank matrices
in $\mathbb{F}^{m\times r}$. Define

\[ \chi_r^m = \begin{cases} (q^m-1)(q^m-q)\cdots(q^m-q^{r-1}) & r > 0, \\ 1 & r = 0, \end{cases} \qquad (33) \]

for $r \le m$.

Lemma 8: When $0 \le r \le m$, $|\mathrm{Fr}(\mathbb{F}^{m\times r})| = \chi_r^m$.

Proof: The lemma is trivial for $r = 0$, so we consider $r > 0$. We can count the number of full rank matrices
in $\mathbb{F}^{m\times r}$ column by column. For the first column, we can choose any vector in $\mathbb{F}^m$ except the zero vector. Thus we
have $q^m - 1$ choices. With the first column fixed, say $v_1$, the second column $v_2$ must be a vector in $\mathbb{F}^m$ that is linearly
independent of $v_1$. Hence, we have $q^m - q$ choices of $v_2$. Repeating this process, we obtain that the number of full
rank $m \times r$ matrices is $(q^m-1)(q^m-q)\cdots(q^m-q^{r-1}) = \chi_r^m$. ■
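
The column-by-column count in Lemma 8 can be confirmed by brute force for small parameters. This is a minimal sketch under the assumption $q = 2$, with columns encoded as integer bitmasks; the function names are illustrative.

```python
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    vectors = list(vectors)
    rank = 0
    while vectors:
        pivot = vectors.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot
            # Clear that bit from every remaining vector.
            vectors = [v ^ pivot if v & low else v for v in vectors]
    return rank

def count_full_rank(m, r):
    """Brute-force count of full-rank m x r matrices over GF(2)."""
    return sum(1 for cols in product(range(2**m), repeat=r)
               if rank_gf2(cols) == r)

print(count_full_rank(3, 2), chi(3, 2, 2))  # both counts are 42
```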

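Define

\[ \zeta_r^m = \chi_r^m\, q^{-mr}. \qquad (34) \]

Since the number of $m \times r$ matrices is $q^{mr}$, $\zeta_r^m$ can be regarded as the probability that a randomly chosen $m \times r$
matrix is full rank.

Lemma 9: Let $G$ be an $s \times m$ random matrix whose components are independent and uniformly distributed over $\mathbb{F}$. Then for $r \le m$,

\[ P_{\mathrm{rk}(GH)|\mathrm{rk}(H)}(s|r) = \zeta_s^r, \]

where $H$ is any $m \times n$ random matrix.

Proof: Fix an $m \times n$ matrix $H$ with $\mathrm{rk}(H) = r$. Let $F = GH$ and let $g_i$ and $f_i$ be the $i$th rows of $G$ and $F$,
respectively. Since $g_i$ has independent, uniformly distributed components, $\Pr\{g_i = g\} = q^{-m}$. For $f$ with $f^\top \in \langle H^\top\rangle$,

\[ \Pr\{g_i H = f\} = q^{-m}\,|\mathrm{Ker}(H)| = q^{-r}, \]

where $\mathrm{Ker}(H) = \{g : gH = 0\}$ and $|\mathrm{Ker}(H)| = q^{m-\mathrm{rk}(H)}$. So for $F$ with $\langle F^\top\rangle \le \langle H^\top\rangle$,

\[
\begin{aligned}
p_{GH|H}(F|H) &= \Pr\{g_i H = f_i,\ i = 1,\ldots,s\} \\
&= \prod_{i=1}^{s} \Pr\{g_i H = f_i\} \\
&= q^{-sr}. && (35)
\end{aligned}
\]

Thus,

\[
\begin{aligned}
P_{\mathrm{rk}(GH)|H}(s|H) &= q^{-sr}\,|\{F : \langle F^\top\rangle \le \langle H^\top\rangle,\ \mathrm{rk}(F) = s\}| \\
&= q^{-sr}\,\chi_s^r \\
&= \zeta_s^r,
\end{aligned}
\]

where $|\{F : \langle F^\top\rangle \le \langle H^\top\rangle,\ \mathrm{rk}(F) = s\}| = \chi_s^r$ follows from Lemma 8. Last, since $\mathrm{rk}(H) \to H \to
\mathrm{rk}(GH)$ forms a Markov chain,

\[
\begin{aligned}
P_{\mathrm{rk}(GH)|\mathrm{rk}(H)}(s|r) &= \sum_{H:\ \mathrm{rk}(H)=r} P_{\mathrm{rk}(GH)|H}(s|H)\, p_{H|\mathrm{rk}(H)}(H|r) \\
&= \zeta_s^r \sum_{H:\ \mathrm{rk}(H)=r} p_{H|\mathrm{rk}(H)}(H|r) \\
&= \zeta_s^r.
\end{aligned}
\]

The proof is complete. ■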
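
Lemma 9 can be illustrated by exhaustive enumeration over $\mathbb{F}_2$. The sketch below is an assumption-laden example, not part of the proof: it fixes one particular $3 \times 3$ matrix $H$ of rank 2 (rows as bitmasks), enumerates every $s \times 3$ matrix $G$, and compares the exact probability that $\mathrm{rk}(GH) = s$ with $\zeta_s^r$.

```python
from fractions import Fraction
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def zeta(m, r, q):
    """zeta_r^m = chi_r^m * q^(-m r) as an exact fraction."""
    return Fraction(chi(m, r, q), q ** (m * r))

def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    vectors = list(vectors)
    rank = 0
    while vectors:
        pivot = vectors.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vectors = [v ^ pivot if v & low else v for v in vectors]
    return rank

def row_times_matrix(g, h_rows):
    """Row vector g times H over GF(2): XOR of the rows of H selected by g."""
    out = 0
    for i, row in enumerate(h_rows):
        if (g >> i) & 1:
            out ^= row
    return out

def prob_rank_gh(h_rows, s, target):
    """Exact Pr{rk(GH) = target} for a uniform s x m matrix G over GF(2)."""
    m = len(h_rows)
    hits = sum(1 for g_rows in product(range(2**m), repeat=s)
               if rank_gf2(row_times_matrix(g, h_rows) for g in g_rows) == target)
    return Fraction(hits, 2 ** (m * s))

H = [0b110, 0b011, 0b101]  # third row is the sum of the first two, so rk(H) = 2
r = rank_gf2(H)
print(prob_rank_gh(H, 2, 2), zeta(r, 2, 2))  # both equal 3/8
```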

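The Grassmannian $\mathrm{Gr}(r,\mathbb{F}^t)$ is the set of all $r$-dimensional subspaces of $\mathbb{F}^t$. Thus $\mathrm{Pj}(m,\mathbb{F}^t) = \bigcup_{r\le m} \mathrm{Gr}(r,\mathbb{F}^t)$.
The Gaussian binomials are defined as

\[ \binom{m}{r}_q = \frac{\chi_r^m}{\chi_r^r}. \]

Lemma 10: The number of $r$-dimensional subspaces of $\mathbb{F}^m$ is given by the Gaussian binomial $\binom{m}{r}_q$.

Proof: Define an equivalence relation on $\mathrm{Fr}(\mathbb{F}^{m\times r})$ by $X \sim X'$ if $\langle X\rangle = \langle X'\rangle$. The equivalence class $[X]$ is the
set of all matrices that are equivalent to $X$. We have $[X] = \{X\Phi : \Phi \in \mathrm{Fr}(\mathbb{F}^{r\times r})\}$. Thus $|[X]| = |\mathrm{Fr}(\mathbb{F}^{r\times r})| = \chi_r^r$.
Since $\mathrm{Gr}(r,\mathbb{F}^m) = \mathrm{Fr}(\mathbb{F}^{m\times r})/\!\sim$, the quotient set of $\mathrm{Fr}(\mathbb{F}^{m\times r})$ by $\sim$, we have $|\mathrm{Gr}(r,\mathbb{F}^m)| = |\mathrm{Fr}(\mathbb{F}^{m\times r})|/|[X]| = \chi_r^m/\chi_r^r = \binom{m}{r}_q$. ■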
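
As a small check of Lemma 10, the following sketch enumerates all subspaces of $\mathbb{F}_2^m$ explicitly and compares the count with the Gaussian binomial. The restriction to $q = 2$ and the bitmask encoding are assumptions made to keep the example short.

```python
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial chi_r^m / chi_r^r."""
    return chi(m, r, q) // chi(r, r, q)

def span(vectors):
    """Span over GF(2) of bitmask vectors, returned as a frozenset."""
    s = {0}
    for v in vectors:
        s |= {x ^ v for x in s}
    return frozenset(s)

def subspaces_gf2(m, r):
    """All r-dimensional subspaces of GF(2)^m."""
    found = set()
    for vs in product(range(2**m), repeat=r):
        s = span(vs)
        if len(s) == 2**r:  # the r generators are linearly independent
            found.add(s)
    return found

print(len(subspaces_gf2(4, 2)), gauss_binom(4, 2, 2))  # both counts are 35
```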

Let

\[ \chi_r^{m,n} = \frac{\chi_r^m\,\chi_r^n}{\chi_r^r}, \qquad (36) \]

which is the number of $m \times n$ matrices with rank $r$.

Lemma 11: For $m \ge r'$ and $r \ge r'$, define a set $S = \{X \in \mathbb{F}^{m\times r} : \mathrm{rk}(X) = r'\}$. Then

\[ |S| = \binom{m}{r'}_q \chi_{r'}^{r} = \chi_{r'}^{m,r}. \qquad (37) \]

Furthermore,

\[ \sum_{r'} \chi_{r'}^{m,r} = q^{mr}. \qquad (38) \]

Proof: The column vectors of $X \in S$ span an $r'$-dimensional subspace of an $m$-dimensional vector space. Let
$\{V_1, V_2, \ldots, V_n\}$ be the set of $r'$-dimensional subspaces of an $m$-dimensional vector space, where $n = \binom{m}{r'}_q$. Let
$S_{V_i} = \{X \in \mathbb{F}^{m\times r} : \langle X\rangle = V_i\}$; the set $\{S_{V_i}\}$ is a partition of $S$. We know that $|S_{V_i}| = \chi_{r'}^{r}$. Therefore,

\[ |S| = \binom{m}{r'}_q \chi_{r'}^{r} = \chi_{r'}^{m,r}. \qquad (39) \]

The equality in (38) follows because both sides are the number of $m \times r$ matrices. ■
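
The rank counts in (36) to (38) can be verified by enumerating all small matrices over $\mathbb{F}_2$. This is an illustrative sketch with hypothetical helper names, assuming $q = 2$ and rows stored as bitmasks.

```python
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def chi_mn(m, n, r, q):
    """chi_r^{m,n}: number of m x n matrices of rank r over F_q, as in (36)."""
    return chi(m, r, q) * chi(n, r, q) // chi(r, r, q)

def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    vectors = list(vectors)
    rank = 0
    while vectors:
        pivot = vectors.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vectors = [v ^ pivot if v & low else v for v in vectors]
    return rank

def rank_histogram_gf2(m, n):
    """counts[k] = number of m x n matrices over GF(2) with rank k."""
    counts = [0] * (min(m, n) + 1)
    for rows in product(range(2**n), repeat=m):
        counts[rank_gf2(rows)] += 1
    return counts

hist = rank_histogram_gf2(2, 3)
formula = [chi_mn(2, 3, k, 2) for k in range(3)]
print(hist, formula)  # both are [1, 21, 42]
```

The histogram sums to $2^{6} = 64$, which is the identity (38) for $m = 2$, $r = 3$.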
Lemma 12: Let $V \le \mathbb{F}^m$ be an $s$-dimensional subspace. Then, the number of subspaces $U$ with $V \le U$ and
$\dim(U) = r$ is

\[ \binom{m-s}{r-s}_q = \frac{\binom{m}{r}_q \binom{r}{s}_q}{\binom{m}{s}_q}. \qquad (40) \]

Proof: Let $U$ be a subspace with $V \le U$ and $\dim(U) = r$. Then $U/V$ is an $(r-s)$-dimensional subspace of the
quotient space $\mathbb{F}^m/V$, which has dimension $m-s$, and the map $U \mapsto U/V$ is a bijection between such subspaces $U$ and the
$(r-s)$-dimensional subspaces of $\mathbb{F}^m/V$. The number of the latter is $\binom{m-s}{r-s}_q$. The equality in (40) is the direct result of the
definitions. ■
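
Both sides of (40) can be compared against an explicit enumeration. The sketch below assumes $q = 2$ and one particular one-dimensional subspace $V$ chosen for illustration; it counts the $r$-dimensional subspaces of $\mathbb{F}_2^4$ that contain $V$.

```python
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial chi_r^m / chi_r^r."""
    return chi(m, r, q) // chi(r, r, q)

def span(vectors):
    """Span over GF(2) of bitmask vectors, returned as a frozenset."""
    s = {0}
    for v in vectors:
        s |= {x ^ v for x in s}
    return frozenset(s)

def count_superspaces_gf2(m, r, V):
    """Number of r-dimensional subspaces of GF(2)^m that contain V."""
    found = set()
    for vs in product(range(2**m), repeat=r):
        s = span(vs)
        if len(s) == 2**r and V <= s:
            found.add(s)
    return len(found)

m, s = 4, 1
V = span([0b0011])  # an arbitrary 1-dimensional subspace
for r in (2, 3):
    brute = count_superspaces_gf2(m, r, V)
    lhs = gauss_binom(m - s, r - s, 2)
    rhs = gauss_binom(m, r, 2) * gauss_binom(r, s, 2) // gauss_binom(m, s, 2)
    print(r, brute, lhs, rhs)
```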

Appendix B 
Useful Results 

Define

\[ \Xi_q(s) = \prod_{i=s}^{\infty} (1 - q^{-i}). \qquad (41) \]

Lemma 13: For $r \le m$, $\zeta_r^m > \Xi_2(1) > 0.2887$.

Proof: $\Xi_q(s)$ is increasing in both $s$ and $q$, so $\Xi_q(s) \ge \Xi_2(1)$ for all $q \ge 2$ and $s \ge 1$, where $\Xi_2(1)$ is a mathematical constant with
approximate value 0.28879. So $\zeta_r^m > \Xi_q(m-r+1) \ge \Xi_2(1)$. ■
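
The constant $\Xi_2(1)$ and the bound of Lemma 13 can be checked numerically. The sketch below truncates the infinite product after a fixed number of terms (an assumption that is harmless here, since the omitted factors differ from 1 by less than floating-point precision) and scans a small, arbitrarily chosen range of $q$, $m$ and $r$.

```python
def xi(q, s, terms=200):
    """Truncated product Xi_q(s) = prod_{i >= s} (1 - q^(-i))."""
    out = 1.0
    for i in range(s, s + terms):
        out *= 1.0 - float(q) ** (-i)
    return out

def zeta(m, r, q):
    """zeta_r^m = prod_{i=0}^{r-1} (1 - q^(i-m)): full-rank probability."""
    out = 1.0
    for i in range(r):
        out *= 1.0 - float(q) ** (i - m)
    return out

xi21 = xi(2, 1)
worst = min(zeta(m, r, q)
            for q in (2, 3, 4, 5)
            for m in range(1, 12)
            for r in range(m + 1))
print(xi21, worst)
```

The smallest $\zeta_r^m$ in the scanned range occurs for $q = 2$ and $r = m$, and it stays above $\Xi_2(1) \approx 0.28879$.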

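Lemma 14: $\binom{m}{r}_q < c\,q^{(m-r)r}$, where $c \approx 3.4627$.

Proof: By definition, $\binom{m}{r}_q = \frac{\chi_r^m}{\chi_r^r} = q^{(m-r)r}\frac{\zeta_r^m}{\zeta_r^r} < q^{(m-r)r}\frac{1}{\Xi_2(1)} = c\,q^{(m-r)r}$, where $c = 1/\Xi_2(1) \approx 1/0.28879$. ■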
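
A quick numerical scan illustrates Lemma 14. In this sketch $c$ is computed as $1/\Xi_2(1)$ from a truncated product rather than taken as the rounded value 3.4627, and the parameter range is an arbitrary choice made for the example.

```python
def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def gauss_binom(m, r, q):
    """Gaussian binomial chi_r^m / chi_r^r."""
    return chi(m, r, q) // chi(r, r, q)

def xi(q, s, terms=200):
    """Truncated product Xi_q(s) = prod_{i >= s} (1 - q^(-i))."""
    out = 1.0
    for i in range(s, s + terms):
        out *= 1.0 - float(q) ** (-i)
    return out

c = 1.0 / xi(2, 1)
# Largest ratio [m r]_q / q^((m-r) r) over the scanned range.
max_ratio = max(gauss_binom(m, r, q) / q ** ((m - r) * r)
                for q in (2, 3, 4, 5, 7)
                for m in range(13)
                for r in range(m + 1))
print(c, max_ratio)
```

The ratio approaches $c$ from below as $m$ and $r$ grow with $q = 2$, so the constant cannot be improved by much.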
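Lemma 15: For $V \le U \le \mathbb{F}^T$ and $V' \le U' \le \mathbb{F}^T$ with $\dim(U) = \dim(U')$ and $\dim(V) = \dim(V')$, there exists
$\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $\Phi U = U'$ and $\Phi V = V'$.

Proof: Let $r = \dim(U)$ and $s = \dim(V)$. Find a basis $\{b_i : i = 1,\ldots,T\}$ of $\mathbb{F}^T$ such that $\{b_i : i = 1,\ldots,r\}$ is a basis of $U$ and $\{b_i : i =
1,\ldots,s\}$ is a basis of $V$. We can do this by first finding a basis of $V$, extending it to a basis of $U$, and
further extending it to a basis of $\mathbb{F}^T$. Similarly, find a basis $\{b'_i : i = 1,\ldots,T\}$ of $\mathbb{F}^T$ such that $\{b'_i : i = 1,\ldots,r\}$
is a basis of $U'$ and $\{b'_i : i = 1,\ldots,s\}$ is a basis of $V'$. Consider the linear system of equations

\[ \Phi b_i = b'_i, \quad i = 1,\ldots,T. \]

We know that there exists a unique $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ satisfying this linear system, and that $\Phi V = V'$ and $\Phi U = U'$. ■
Lemma 16: For $X, X' \in \mathbb{F}^{T\times M}$, $\langle X^\top\rangle = \langle X'^\top\rangle$ if and only if there exists $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $X' = \Phi X$.
Proof: Let $r = \mathrm{rk}(X)$. Call the first condition a) and the second b), and let c) denote the condition that there exist full-rank decompositions $X = BD$ and $X' = B'D$ with the same matrix $D$. First, show a) $\Rightarrow$ c). Fix one full-rank decomposition $X = BD$. Since $\langle D^\top\rangle = \langle X^\top\rangle =
\langle X'^\top\rangle$, there exists a decomposition $X' = B'D$, obtained by first fixing $D$. Second,
show c) $\Rightarrow$ b). With the decomposition in c), there exists $\Phi \in \mathrm{Fr}(\mathbb{F}^{T\times T})$ such that $\Phi B = B'$: extend $B$ and
$B'$ to full-rank $T \times T$ matrices $[B\ \ B_0]$ and $[B'\ \ B'_0]$. Then, $\Phi = [B'\ \ B'_0][B\ \ B_0]^{-1}$ is one such matrix, since
$\Phi[B\ \ B_0] = [B'\ \ B'_0]$, and hence $\Phi X = \Phi BD = B'D = X'$. Last, we have b) $\Rightarrow$ a). ■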
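Lemma 17: For $U \le \mathbb{F}^t$ with $\dim(U) = r \le m$, let

\[ A(m,U) = \{X \in \mathbb{F}^{t\times m} : \langle X\rangle = U\}. \]

Then,

\[ |A(m,U)| = \chi_r^m, \]

and for $\Phi \in \mathrm{Fr}(\mathbb{F}^{t\times t})$,

\[ A(m,\Phi U) = \Phi A(m,U). \]

Proof: Find a $t \times r$ matrix $B$ with $\langle B\rangle = U$. Then, we have

\[ A(m,U) = \{BD : D \in \mathrm{Fr}(\mathbb{F}^{r\times m})\} = B\,\mathrm{Fr}(\mathbb{F}^{r\times m}). \]

Thus, $|A(m,U)| = |\mathrm{Fr}(\mathbb{F}^{r\times m})| = \chi_r^m$. For $\Phi \in \mathrm{Fr}(\mathbb{F}^{t\times t})$, $\langle \Phi B\rangle = \Phi U$. So $A(m,\Phi U) = \Phi B\,\mathrm{Fr}(\mathbb{F}^{r\times m}) =
\Phi A(m,U)$. ■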
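
The cardinality claim of Lemma 17 can be confirmed by brute force. The sketch below assumes $q = 2$, $t = 3$ and one specific two-dimensional subspace $U$ chosen for illustration; it counts the $t \times m$ matrices whose column space equals $U$.

```python
from itertools import product

def chi(m, r, q):
    """chi_r^m = (q^m - 1)(q^m - q)...(q^m - q^(r-1)), with chi_0^m = 1."""
    out = 1
    for i in range(r):
        out *= q**m - q**i
    return out

def span(vectors):
    """Span over GF(2) of bitmask vectors, returned as a frozenset."""
    s = {0}
    for v in vectors:
        s |= {x ^ v for x in s}
    return frozenset(s)

def count_A(m, U, t):
    """Number of t x m matrices over GF(2) (m column bitmasks) with column space U."""
    return sum(1 for cols in product(range(2**t), repeat=m) if span(cols) == U)

t = 3
U = span([0b001, 0b010])  # a 2-dimensional subspace of GF(2)^3
print(count_A(2, U, t), chi(2, 2, 2))  # both counts are 6
print(count_A(3, U, t), chi(3, 2, 2))  # both counts are 42
```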